LINEAR 
ALGEBRA 


CORE TOPICS FOR THE 
SECOND COURSE 


Dragu Atanasiu
Piotr Mikusinski 


World Scientific



Dragu Atanasiu 


University of Boras, Sweden 


Piotr Mikusinski 


University of Central Florida, USA 


World Scientific


NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI • TOKYO


Published by 


World Scientific Publishing Co. Pte. Ltd. 

5 Toh Tuck Link, Singapore 596224 

USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE 


Library of Congress Control Number: 2022057625 


British Library Cataloguing-in-Publication Data 
A catalogue record for this book is available from the British Library. 


LINEAR ALGEBRA 
Core Topics for the Second Course 


Copyright © 2023 by World Scientific Publishing Co. Pte. Ltd. 


All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or 
mechanical, including photocopying, recording or any information storage and retrieval system now known or to 
be invented, without written permission from the publisher. 


For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, 
Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from 
the publisher. 


ISBN 978-981-125-854-1 (hardcover) 
ISBN 978-981-125-855-8 (ebook for institutions) 
ISBN 978-981-125-856-5 (ebook for individuals)


For any available supplementary material, please visit 
https://www.worldscientific.com/worldscibooks/10.1142/12898#t=suppl 


Printed in Singapore 


Preface 


This is a book for a second course in linear algebra. 

In order to facilitate a smooth transition to the world of rigorous proofs we 
mix, to a greater extent than most textbooks at this level, abstract theory with 
matrix calculations. We present numerous examples and proofs of particular 
cases of important results before the general versions are formulated and proved. 
We noticed in many years of teaching that a proof of a particular case which 
captures the main idea of the general theorem has a major impact on the depth 
of understanding of the general theory. Reading simpler and more manageable 
proofs is also more likely to encourage students to work with proofs. Students 
can try to prove another particular case or the general case using the knowledge 
gained from the particular case. For some theorems we give two or even three 
proofs. In this way we give students an opportunity to see important results 
from different angles and at the same time to see connections between different 
results presented in the book. 

Students are assumed to be familiar with calculations with real matrices. 
For example, students should be able to calculate products of matrices and 
find the reduced row echelon form of a matrix. All this background material is 
presented in our book Linear Algebra, Core topics for the first course, but the 
present book does not assume that students are familiar with our presentation 
of that material. Any standard book on Matrix Linear Algebra will provide a 
sufficient preparation for the present volume. 

On the other hand, since most material of this book mirrors, at a higher, less 
computational, and more abstract level, the content of our book Linear Algebra, 
Core topics for the first course, students who find a result from this book too 
abstract would benefit from reading the same material from the first course. For 
example, in Core topics for the first course there are a lot of concrete examples 
of Jordan forms and singular value decompositions. Getting familiar with those 
examples would make the theory presented in the second course more accessible. 

The majority of results are presented under the assumption that the vector 
spaces are finite dimensional. Some examples of infinite dimensional spaces are 
given and a very brief discussion of infinite dimensional inner product spaces is 
included as Appendix D. 

In Chapter 1 we introduce vector spaces and discuss some basic ideas
including subspaces, linear independence, bases, dimension, and direct sums.
We consider both real and complex vector spaces. For students with limited
experience with complex numbers we provide an appendix that presents complex
numbers in an elementary and detailed manner.

In Chapter 2 we discuss linear transformations between vector spaces. The 
presented topics include projections, the Rank-Nullity Theorem, isomorphisms, 
dual spaces, matrix representation of linear transformations, and quotient 
spaces. In order to keep this chapter at a reasonable size we only prove the 
results that are used in later chapters and give more results as exercises. 

In Chapter 3 we discuss inner product spaces, including orthogonal
projections and self-adjoint, normal, unitary, orthogonal, and positive linear
transformations. A careful presentation of spectral theorems and the singular
value decomposition constitutes a substantial part of this chapter. Since we
determine eigenvalues without using characteristic polynomials, determinants
are used in Chapter 3 only in some examples and exercises.

In Chapter 4 we show how to obtain bases such that the matrix of a lin- 
ear transformation becomes diagonal or block-diagonal. In order to construct 
such bases, we study factorization of characteristic polynomials. In order to 
give interesting examples we need to calculate characteristic polynomials us- 
ing determinants of endomorphisms. These determinants are introduced at the 
beginning of Chapter 4 via multilinear algebra. 

It is possible to obtain diagonal and block-diagonal matrices without using 
determinants, as shown in [2], but in our opinion the discussion of determinants 
in connection with alternating multilinear forms is an interesting part of linear 
algebra. While the theory without determinants has a certain appeal, when it 
comes to determining diagonal and block-diagonal forms in concrete cases one 
is limited to very simple examples where calculating the eigenvalues is trivial, 
as can be seen in books where determinants are avoided. We believe that
students should have all possible instruments to solve problems, and
determinants are essential in determining characteristic polynomials. It has
been our experience that presenting topics in a less theoretical way and
showing more concrete calculations using determinants increases the
understanding of the presented topics. Since every student taking a
proof-based course in linear algebra has some knowledge of determinants, the
presence of determinants in this book helps students make connections with
more elementary courses. Moreover, in our book we do not calculate
determinants as in matrix linear algebra or precalculus, but instead we
emphasize the connection to multilinear algebra. Even if one dislikes
determinants because calculating them is tedious and non-intuitive, that is
no reason to overlook the elegance of alternating multilinear forms or to
eliminate them from a first course of linear algebra with proofs.

Appendices that provide short introductions to permutations, complex num- 
bers, and polynomials are included at the end of the book. Proofs of all results 
presented in these appendices are included. 

A complete solution manual is available upon request for all instructors who 
adopt this book as a course text. Please send your request to sales@wspc.com.


Contents 


Preface 


1 Vector Spaces
  1.1 Definitions and examples
  1.2 Subspaces
  1.3 Linearly independent vectors and bases
  1.4 Direct sums
  1.5 Dimension of a vector space
  1.6 Change of basis
  1.7 Exercises
    1.7.1 Definitions and examples
    1.7.2 Subspaces
    1.7.3 Linearly independent vectors and bases
    1.7.4 Direct sums
    1.7.5 Dimension of a vector space
    1.7.6 Change of basis

2 Linear Transformations
  2.1 Basic properties
    2.1.1 The kernel and range of a linear transformation
    2.1.2 Projections
    2.1.3 The Rank-Nullity Theorem
  2.2 Isomorphisms
  2.3 Linear transformations and matrices
    2.3.1 The matrix of a linear transformation
    2.3.2 The isomorphism between $\mathcal{M}_{n\times m}(\mathbb{K})$ and $\mathcal{L}(\mathcal{V},\mathcal{W})$
  2.4 Duality
    2.4.1 The dual space
    2.4.2 The bidual
  2.5 Quotient spaces
  2.6 Exercises
    2.6.1 Basic properties
    2.6.2 Isomorphisms
    2.6.3 Linear transformations and matrices
    2.6.4 Duality
    2.6.5 Quotient spaces

3 Inner Product Spaces
  3.1 Definitions and examples
  3.2 Orthogonal projections
    3.2.1 Orthogonal projections on lines
    3.2.2 Orthogonal projections on arbitrary subspaces
    3.2.3 Calculations and applications of orthogonal projections
    3.2.4 The annihilator and the orthogonal complement
    3.2.5 The Gram-Schmidt orthogonalization process and orthonormal bases
  3.3 The adjoint of a linear transformation
  3.4 Spectral theorems
    3.4.1 Spectral theorems for operators on complex inner product spaces
    3.4.2 Self-adjoint operators on real inner product spaces
    3.4.3 Unitary operators
    3.4.4 Orthogonal operators on real inner product spaces
    3.4.5 Positive operators
  3.5 Singular value decomposition
  3.6 Exercises
    3.6.1 Definitions and examples
    3.6.2 Orthogonal projections
    3.6.3 The adjoint of a linear transformation
    3.6.4 Spectral theorems
    3.6.5 Singular value decomposition

4 Reduction of Endomorphisms
  4.1 Eigenvalues and diagonalization
    4.1.1 Multilinear alternating forms and determinants
    4.1.2 Diagonalization
  4.2 Jordan canonical form
    4.2.1 Jordan canonical form when the characteristic polynomial has one root
    4.2.2 Uniqueness of the Jordan canonical form when the characteristic polynomial has one root
    4.2.3 Jordan canonical form when the characteristic polynomial has several roots
  4.3 The rational form
  4.4 Exercises
    4.4.1 Diagonalization
    4.4.2 Jordan canonical form
    4.4.3 Rational form

5 Appendices
  Appendix A Permutations
  Appendix B Complex numbers
  Appendix C Polynomials
  Appendix D Infinite dimensional inner product spaces

Bibliography

Index



Chapter 1 


Vector Spaces 


Introduction 


If you are reading this book, you most likely worked with vectors in a number
of courses and you have a fairly good intuitive understanding of what we mean
by a vector. But can you give a formal definition of a vector? It turns out that
the best we can do is to define a vector as an element of a vector space. It may
seem a silly definition, but actually it represents what is quite common in more
advanced mathematics. The idea is that, when we want to describe a certain
class of objects, we don't want to describe the objects themselves, but rather
what we can do with them and what the properties of those operations on
the objects we are defining are. Describing the objects directly limits
applications of the methods we develop to the instances we describe. On the
other hand, if we formulate something about any object that has certain
properties, then it will apply to examples that we may not even be aware of.

We use this approach in our definition of vector spaces. We define a vector 
space as a collection of objects not of a certain kind, but rather objects on which 
certain operations having certain operational properties can be performed. 


1.1 Definitions and examples 


A vector space is a set with an algebraic structure. Elements of a vector space
can be added and scaled. Addition in a vector space $\mathcal{V}$ is a function
that assigns to any $\mathbf{v}, \mathbf{w} \in \mathcal{V}$ a unique element
$\mathbf{v} + \mathbf{w} \in \mathcal{V}$. An element $\mathbf{v} \in \mathcal{V}$
is scaled if it is multiplied by a number, called a scalar in this context. The
result of multiplying $\mathbf{v} \in \mathcal{V}$ by a scalar $c$ is denoted by
$c\mathbf{v}$. In this book scalars are either real numbers or complex numbers.
If it is necessary to specify which case it is, we write a “real vector space”
or a “complex vector space”. If a statement applies to both real vector spaces
and complex vector spaces, we use the letter $\mathbb{K}$ instead of $\mathbb{R}$
or $\mathbb{C}$. The formal definition of vector spaces given below is an example
of such a situation. The set $\mathbb{K}$ is called a “scalar field”. The phrase
“vector space over $\mathbb{K}$” is often used to indicate that $\mathbb{K}$ is
the set of scalars for that vector space.

As explained in the introduction, addition and scaling in the definition of a 
vector space below are not specific operations, but any operations that satisfy 
the listed conditions. In most applications we already know what we mean by 
addition and scaling. In order to use the tools of vector spaces we have to make 
sure that all conditions in the definition are satisfied. After the definition we 
discuss some examples where all conditions are satisfied as well as some examples 
where some conditions are not satisfied. 


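Definition 1.1.1. By a vector space we mean a set $\mathcal{V}$ with addition
that assigns a unique element $\mathbf{v} + \mathbf{w} \in \mathcal{V}$ to any
$\mathbf{v}, \mathbf{w} \in \mathcal{V}$ and scalar multiplication that assigns
a unique element $c\mathbf{v} \in \mathcal{V}$ to any $c \in \mathbb{K}$ and
any $\mathbf{v} \in \mathcal{V}$ in such a way that all of the following
conditions are satisfied:

(a) For every $\mathbf{v}, \mathbf{w} \in \mathcal{V}$ we have $\mathbf{v} + \mathbf{w} = \mathbf{w} + \mathbf{v}$;

(b) For every $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathcal{V}$ we have $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$;

(c) There is an element $\mathbf{0} \in \mathcal{V}$ such that for every $\mathbf{v} \in \mathcal{V}$ we have $\mathbf{0} + \mathbf{v} = \mathbf{v}$;

(d) For every $\mathbf{v} \in \mathcal{V}$ there is $\mathbf{u} \in \mathcal{V}$ such that $\mathbf{v} + \mathbf{u} = \mathbf{0}$;

(e) For every $\mathbf{v} \in \mathcal{V}$ we have $1\mathbf{v} = \mathbf{v}$;

(f) For every $\mathbf{v} \in \mathcal{V}$ and every $c_1, c_2 \in \mathbb{K}$ we have $(c_1 c_2)\mathbf{v} = c_1(c_2\mathbf{v})$;

(g) For every $\mathbf{v} \in \mathcal{V}$ and every $c_1, c_2 \in \mathbb{K}$ we have $(c_1 + c_2)\mathbf{v} = c_1\mathbf{v} + c_2\mathbf{v}$;

(h) For every $\mathbf{v}, \mathbf{w} \in \mathcal{V}$ and every $c \in \mathbb{K}$ we have $c(\mathbf{v} + \mathbf{w}) = c\mathbf{v} + c\mathbf{w}$.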


Now we present some examples of vector spaces. We do not verify that 
all conditions in the definition of a vector space are satisfied. While we do not 
expect that you will verify every condition in every example, you should convince 
yourself that they are satisfied. It is a good exercise to give formal proofs for 
some conditions in some examples, especially if they don’t seem obvious. 


Example 1.1.2. The set of all real numbers R with the standard addition 
and multiplication is a real vector space. Note that in this example the vector 
space and the scalar field are the same set. 

The set of all complex numbers C is an example of a complex vector space 
as well as a real vector space. 


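Example 1.1.3. For every integer $n \geq 1$ the set of all $n \times 1$ matrices
\[
\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}
\]
with $a_1, \dots, a_n \in \mathbb{K}$, denoted by $\mathbb{K}^n$, is a vector
space with the operations of addition of vectors and multiplication of vectors
by scalars defined by
\[
\begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} + \begin{bmatrix} b_1 \\ \vdots \\ b_n \end{bmatrix} = \begin{bmatrix} a_1 + b_1 \\ \vdots \\ a_n + b_n \end{bmatrix}
\quad\text{and}\quad
c \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix} = \begin{bmatrix} c a_1 \\ \vdots \\ c a_n \end{bmatrix}
\]
for $c \in \mathbb{K}$.

Clearly, $\mathbb{C}^n$ is a vector space over $\mathbb{C}$ and $\mathbb{R}^n$
is a vector space over $\mathbb{R}$. It is also possible to consider
$\mathbb{C}^n$ as a vector space over $\mathbb{R}$.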
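
The componentwise operations of Example 1.1.3 are easy to try out on a computer. The following is a minimal sketch, not part of the original text: the function names `add`, `scale`, and `check_axioms` are hypothetical, vectors are stored as Python tuples, and the sample vectors are arbitrary. It spot-checks conditions (a) to (h) of Definition 1.1.1 on a few vectors; such a check illustrates the conditions but does not prove them.

```python
from fractions import Fraction

def add(v, w):
    """Componentwise sum of two vectors of the same length."""
    return tuple(a + b for a, b in zip(v, w))

def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return tuple(c * a for a in v)

def check_axioms(u, v, w, c1, c2):
    """Spot-check conditions (a)-(h) of Definition 1.1.1 on sample data."""
    zero = tuple(0 * a for a in v)
    return all([
        add(v, w) == add(w, v),                                   # (a)
        add(u, add(v, w)) == add(add(u, v), w),                   # (b)
        add(zero, v) == v,                                        # (c)
        add(v, scale(-1, v)) == zero,                             # (d)
        scale(1, v) == v,                                         # (e)
        scale(c1 * c2, v) == scale(c1, scale(c2, v)),             # (f)
        scale(c1 + c2, v) == add(scale(c1, v), scale(c2, v)),     # (g)
        scale(c1, add(v, w)) == add(scale(c1, v), scale(c1, w)),  # (h)
    ])

# Rational sample vectors in R^3 (exact arithmetic with Fraction).
real_ok = check_axioms((Fraction(1, 2), 2, -3), (0, Fraction(5, 3), 1),
                       (4, -1, 7), Fraction(2, 7), -3)

# Sample vectors in C^2 with small integer real and imaginary parts.
complex_ok = check_axioms((1 + 2j, 3j), (2 - 1j, 4), (-1j, 1 + 1j),
                          2 + 1j, 1 - 3j)
```

Exact rational numbers and complex numbers with small integer parts are used so that equality tests are not disturbed by rounding.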


Example 1.1.4. Let $\mathcal{V} = \{a\}$, a set with a single element $a$. With
the operations defined as
\[
a + a = a \quad\text{and}\quad ca = a,
\]
for any $c \in \mathbb{K}$, it is a vector space.

Note that we must have $a = \mathbf{0}$, where $\mathbf{0}$ is the element whose
existence is guaranteed by condition (c) in Definition 1.1.1. This is the
smallest possible vector space. It is often called the trivial vector space.


Example 1.1.5. Let $S$ be an arbitrary nonempty set. The set of all functions
$f : S \to \mathbb{K}$ with addition defined by
\[
(f + g)(s) = f(s) + g(s) \quad\text{for every } s \in S
\]
and multiplication by scalars defined by
\[
(cf)(s) = c f(s) \quad\text{for every } s \in S \text{ and } c \in \mathbb{K}
\]
is a vector space over $\mathbb{K}$. We will denote this space by
$\mathcal{F}_{\mathbb{K}}(S)$. Note that the constant function $f(s) = 0$ is
the $\mathbf{0}$ of this vector space and the function $g(s) = -f(s)$ is the
element of $\mathcal{F}_{\mathbb{K}}(S)$ such that $f + g = \mathbf{0}$.

This is a very important family of vector spaces. Several examples below
are special cases of $\mathcal{F}_{\mathbb{K}}(S)$. If you verify that all
conditions in the definition of a vector space are satisfied for
$\mathcal{F}_{\mathbb{K}}(S)$, then there is no need to check them for
those examples.
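
To make the pointwise operations concrete, here is a small sketch that is not part of the original text. It assumes a three-element set $S$ chosen only for illustration, and the helper names `f_add` and `f_scale` are hypothetical. Functions on $S$ are ordinary Python callables, and the sum and scalar multiple are built exactly as in the two defining formulas.

```python
def f_add(f, g):
    """Pointwise sum: (f + g)(s) = f(s) + g(s)."""
    return lambda s: f(s) + g(s)

def f_scale(c, f):
    """Pointwise scaling: (cf)(s) = c f(s)."""
    return lambda s: c * f(s)

def zero(s):
    """The constant function 0, which is the zero vector of the space."""
    return 0

S = ["red", "green", "blue"]                 # any nonempty set works
f = {"red": 1, "green": -2, "blue": 5}.get   # a function S -> R
g = {"red": 4, "green": 0, "blue": -1}.get   # another function S -> R

h = f_add(f, f_scale(3, g))                  # the vector f + 3g
values = [h(s) for s in S]

neg_f = f_scale(-1, f)                       # the function s -> -f(s)
cancels = all(f_add(f, neg_f)(s) == zero(s) for s in S)
```

Nothing in `f_add` or `f_scale` depends on what the elements of `S` are, which mirrors the remark that the conditions need to be verified only once for the whole family.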


Example 1.1.6. We denote by $\mathcal{M}_{m \times n}(\mathbb{K})$ the set of
all $m \times n$ matrices with entries from $\mathbb{K}$. If $A$ and $B$ are
matrices from $\mathcal{M}_{m \times n}(\mathbb{K})$ such that the $(j,k)$ entry
of the matrix $A$ is $a_{jk}$ and the $(j,k)$ entry of the matrix $B$ is
$b_{jk}$, and $c$ is a number from $\mathbb{K}$, then $A + B$ is the matrix with
the $(j,k)$ entry equal to $a_{jk} + b_{jk}$ and $cA$ is the matrix with the
$(j,k)$ entry equal to $c a_{jk}$. It is easy to verify that with these
operations $\mathcal{M}_{m \times n}(\mathbb{K})$ is a vector space over
$\mathbb{K}$.

Note that the set $\mathcal{M}_{m \times n}(\mathbb{K})$ can be interpreted as
the space $\mathcal{F}_{\mathbb{K}}(S)$ where
\[
S = \{(j,k) : 1 \leq j \leq m,\ 1 \leq k \leq n\}.
\]


Example 1.1.7. The set of all infinite sequences $(a_n) = (a_1, a_2, \dots)$ of
real numbers can be identified with $\mathcal{F}_{\mathbb{R}}(\{1, 2, 3, \dots\})$
and thus it is a real vector space. Similarly, the set of all infinite sequences
of complex numbers is a complex vector space. The operations defined in
Example 1.1.5 can be described in a more intuitive way:
\[
(a_1, a_2, \dots) + (b_1, b_2, \dots) = (a_1 + b_1, a_2 + b_2, \dots)
\]
and
\[
c(a_1, a_2, \dots) = (c a_1, c a_2, \dots).
\]

In some applications it is natural to consider the vector space
$\mathcal{F}_{\mathbb{K}}(\mathbb{Z})$, where $\mathbb{Z}$ is the set of all
integers, of all “two-sided” sequences
\[
(\dots, a_{-2}, a_{-1}, a_0, a_1, a_2, \dots)
\]
of real or complex numbers.


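Example 1.1.8. We will define a vector space over $\mathbb{R}$ whose elements
are lines of $\mathbb{R}^2$ parallel to a given line. We recall that a line is
a set of the form
$\mathbf{x} + \mathbb{R}\mathbf{a} = \{\mathbf{x} + c\mathbf{a} : c \in \mathbb{R}\}$
where $\mathbf{x}, \mathbf{a} \in \mathbb{R}^2$ and $\mathbf{a} \neq \mathbf{0}$.

Let $\mathbf{a}$ be a fixed nonzero vector in $\mathbb{R}^2$. We define
\[
\widehat{\mathbf{x}} = \mathbf{x} + \mathbb{R}\mathbf{a}.
\]
In other words, $\widehat{\mathbf{x}}$ is the line through $\mathbf{x}$ parallel
to $\mathbf{a}$. First note that $\widehat{\mathbf{0}} = \mathbb{R}\mathbf{a}$.
Moreover,
\[
\widehat{\mathbf{x}} = \widehat{\mathbf{y}} \quad\text{if and only if there is a real number } \alpha \text{ such that } \mathbf{y} = \mathbf{x} + \alpha\mathbf{a}.
\]
In other words
\[
\widehat{\mathbf{x}} = \widehat{\mathbf{y}} \quad\text{if and only if}\quad \mathbf{y} \in \widehat{\mathbf{x}}.
\]
Indeed, if $\widehat{\mathbf{x}} = \widehat{\mathbf{y}}$, then
\[
\mathbf{x} + \mathbb{R}\mathbf{a} = \widehat{\mathbf{x}} = \widehat{\mathbf{y}} = \mathbf{y} + \mathbb{R}\mathbf{a},
\]
and thus $\mathbf{y} = \mathbf{y} + 0\mathbf{a} = \mathbf{x} + \alpha\mathbf{a}$
for some $\alpha \in \mathbb{R}$.

On the other hand, if $\mathbf{y} = \mathbf{x} + \alpha\mathbf{a}$ for some
$\alpha \in \mathbb{R}$, then
\[
\widehat{\mathbf{y}} = \mathbf{y} + \mathbb{R}\mathbf{a} = \mathbf{x} + \alpha\mathbf{a} + \mathbb{R}\mathbf{a} = \mathbf{x} + \mathbb{R}\mathbf{a} = \widehat{\mathbf{x}},
\]
because $\alpha\mathbf{a} + \mathbb{R}\mathbf{a} = \mathbb{R}\mathbf{a}$.

Now we define
\[
\widehat{\mathbf{x}} + \widehat{\mathbf{y}} = \widehat{\mathbf{x} + \mathbf{y}} \quad\text{and}\quad c\,\widehat{\mathbf{x}} = \widehat{c\mathbf{x}}.
\]
These operations are well defined because
\[
(\mathbf{x} + \alpha\mathbf{a}) + (\mathbf{y} + \beta\mathbf{a}) = (\mathbf{x} + \mathbf{y}) + (\alpha + \beta)\mathbf{a} \in \widehat{\mathbf{x} + \mathbf{y}}
\]
and
\[
c(\mathbf{x} + \alpha\mathbf{a}) = c\mathbf{x} + (c\alpha)\mathbf{a} \in \widehat{c\mathbf{x}},
\]
for arbitrary real numbers $\alpha$, $\beta$ and $c$.

The set $\mathcal{V} = \{\widehat{\mathbf{x}} : \mathbf{x} \in \mathbb{R}^2\}$
with the operations defined above is a real vector space.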
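
The point of this example is that the sum and the scalar multiple of lines do not depend on which point of a line is used to represent it. The sketch below, which is not part of the original text, tests this on sample data. It relies on an observation that is an assumption of the sketch rather than something stated above: if $\mathbf{n}$ is a nonzero vector perpendicular to $\mathbf{a}$, then two points of $\mathbb{R}^2$ lie on the same line parallel to $\mathbf{a}$ exactly when they have the same dot product with $\mathbf{n}$. The names `label`, `shift`, `add`, and `scale` are hypothetical.

```python
from fractions import Fraction

a = (Fraction(2), Fraction(1))   # fixed nonzero direction vector (sample choice)
n = (-a[1], a[0])                # a nonzero vector perpendicular to a

def label(x):
    """Dot product n.x; it is the same for every point of the line x + Ra."""
    return n[0] * x[0] + n[1] * x[1]

def shift(x, alpha):
    """Another point x + alpha*a of the same line."""
    return (x[0] + alpha * a[0], x[1] + alpha * a[1])

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def scale(c, x):
    return (c * x[0], c * x[1])

x = (Fraction(1), Fraction(3))
y = (Fraction(-2), Fraction(5))
x2 = shift(x, Fraction(7, 3))    # a different representative of the line of x
y2 = shift(y, Fraction(-4))      # a different representative of the line of y
c = Fraction(5, 2)

same_lines = label(x) == label(x2) and label(y) == label(y2)
sum_well_defined = label(add(x, y)) == label(add(x2, y2))
scaling_well_defined = label(scale(c, x)) == label(scale(c, x2))
```

Changing the representatives `x2` and `y2` changes the points that are added, but not the line on which the result lies.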


Example 1.1.9. Let $A = \{(x, x) : x \in \mathbb{R}\}$ and
$B = \{(x, -x) : x \in \mathbb{R}\}$. Show that $A \cup B$ is not a real vector
space.


Solution. It suffices to note that, for example, the vector
$(2, 0) = (1, 1) + (1, -1)$ is not in $A \cup B$. Note that both $A$ and $B$
are vector spaces. □


Example 1.1.10. Let $A = \{(x, y) \in \mathbb{R}^2 : y \geq 0\}$. Show that $A$
is not a real vector space.


Solution. It suffices to observe that, for example, the vector $(1, 4)$ is in
$A$, but the vector $(-2, -8) = -2(1, 4)$ is not in $A$. Note that, if
$(x_1, y_1), (x_2, y_2) \in A$, then
$(x_1 + x_2, y_1 + y_2) = (x_1, y_1) + (x_2, y_2)$ is in $A$. □


Example 1.1.11. Let $A = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 1\}$. Show
that $A$ is not a real vector space.


Solution 1. It suffices to note that the vectors $(1, 0, 0)$ and $(0, 1, 0)$
are in $A$, but the vector $(1, 1, 0) = (1, 0, 0) + (0, 1, 0)$ is not in $A$. □


Solution 2. It suffices to note that the vector $(1, 0, 0)$ is in $A$, but the
vector $(3, 0, 0) = 3(1, 0, 0)$ is not in $A$. □


Note that the set $\mathcal{V} = \{(x, y, z) \in \mathbb{R}^3 : x + y + z = 0\}$
is a real vector space.
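
The counterexamples in Examples 1.1.9 and 1.1.11 can be replayed mechanically. The following sketch is not part of the original text; the membership tests `in_A_union_B` and `in_plane` are hypothetical helpers written for this illustration. A passing spot check for the plane $x + y + z = 0$ is only evidence of closure, not a proof.

```python
def in_A_union_B(p):
    """Membership in the union of the lines y = x and y = -x (Example 1.1.9)."""
    x, y = p
    return y == x or y == -x

def in_plane(p, total):
    """Membership in {(x, y, z) : x + y + z = total}."""
    return sum(p) == total

def add(p, q):
    return tuple(s + t for s, t in zip(p, q))

def scale(c, p):
    return tuple(c * s for s in p)

# Example 1.1.9: both summands are in the union, their sum (2, 0) is not.
union_fails = (in_A_union_B((1, 1)) and in_A_union_B((1, -1))
               and not in_A_union_B(add((1, 1), (1, -1))))

# Example 1.1.11: the plane x + y + z = 1 is closed under neither operation.
sum_fails = not in_plane(add((1, 0, 0), (0, 1, 0)), 1)
scaling_fails = not in_plane(scale(3, (1, 0, 0)), 1)

# The plane x + y + z = 0 passes the same kind of spot check.
p, q = (1, -3, 2), (4, 4, -8)
zero_plane_ok = in_plane(add(p, q), 0) and in_plane(scale(-5, p), 0)
```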


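Example 1.1.12. Let $\mathcal{V}_1$ and $\mathcal{V}_2$ be arbitrary vector
spaces over $\mathbb{K}$. The set of all pairs $(\mathbf{v}_1, \mathbf{v}_2)$
such that $\mathbf{v}_1 \in \mathcal{V}_1$ and $\mathbf{v}_2 \in \mathcal{V}_2$
is denoted by $\mathcal{V}_1 \times \mathcal{V}_2$ and called the Cartesian
product of spaces $\mathcal{V}_1$ and $\mathcal{V}_2$. In symbols,
\[
\mathcal{V}_1 \times \mathcal{V}_2 = \{(\mathbf{v}_1, \mathbf{v}_2) : \mathbf{v}_1 \in \mathcal{V}_1,\ \mathbf{v}_2 \in \mathcal{V}_2\}.
\]
It is easy to verify that $\mathcal{V}_1 \times \mathcal{V}_2$ becomes a vector
space if we define
\[
(\mathbf{v}_1, \mathbf{v}_2) + (\mathbf{w}_1, \mathbf{w}_2) = (\mathbf{v}_1 + \mathbf{w}_1, \mathbf{v}_2 + \mathbf{w}_2) \quad\text{and}\quad c(\mathbf{v}_1, \mathbf{v}_2) = (c\mathbf{v}_1, c\mathbf{v}_2).
\]
More generally, for arbitrary vector spaces $\mathcal{V}_1, \dots, \mathcal{V}_n$
over $\mathbb{K}$ we define
\[
\mathcal{V}_1 \times \dots \times \mathcal{V}_n = \{(\mathbf{v}_1, \dots, \mathbf{v}_n) : \mathbf{v}_1 \in \mathcal{V}_1, \dots, \mathbf{v}_n \in \mathcal{V}_n\}.
\]
The set $\mathcal{V}_1 \times \dots \times \mathcal{V}_n$, called the Cartesian
product of spaces $\mathcal{V}_1, \dots, \mathcal{V}_n$, is a vector space with
the addition
\[
(\mathbf{v}_1, \dots, \mathbf{v}_n) + (\mathbf{w}_1, \dots, \mathbf{w}_n) = (\mathbf{v}_1 + \mathbf{w}_1, \dots, \mathbf{v}_n + \mathbf{w}_n)
\]
and scalar multiplication
\[
c(\mathbf{v}_1, \dots, \mathbf{v}_n) = (c\mathbf{v}_1, \dots, c\mathbf{v}_n).
\]
If $\mathcal{V}_1 = \dots = \mathcal{V}_n = \mathcal{V}$, then we write
\[
\mathcal{V}_1 \times \dots \times \mathcal{V}_n = \underbrace{\mathcal{V} \times \dots \times \mathcal{V}}_{n \text{ times}} = \mathcal{V}^n.
\]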


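Example 1.1.13. Let $\mathcal{V}$ be a real vector space. Show that
$\mathcal{V} \times \mathcal{V}$ is a complex vector space if we define addition
as for the real vector space $\mathcal{V} \times \mathcal{V}$ and
multiplication by complex numbers as follows
\[
(a + bi)(\mathbf{x}, \mathbf{y}) = (a\mathbf{x} - b\mathbf{y}, a\mathbf{y} + b\mathbf{x}),
\]
where $a$ and $b$ are real numbers and $\mathbf{x}$ and $\mathbf{y}$ are vectors
from $\mathcal{V}$.

Solution. We will only verify that
\[
(c + di)\bigl((a + bi)(\mathbf{x}, \mathbf{y})\bigr) = \bigl((c + di)(a + bi)\bigr)(\mathbf{x}, \mathbf{y})
\]
because the verification of the other axioms is similar and easier.
\begin{align*}
(c + di)\bigl((a + bi)(\mathbf{x}, \mathbf{y})\bigr)
&= (c + di)(a\mathbf{x} - b\mathbf{y}, a\mathbf{y} + b\mathbf{x}) \\
&= \bigl(c(a\mathbf{x} - b\mathbf{y}) - d(a\mathbf{y} + b\mathbf{x}),\ c(a\mathbf{y} + b\mathbf{x}) + d(a\mathbf{x} - b\mathbf{y})\bigr) \\
&= \bigl((ac - bd)\mathbf{x} - (ad + bc)\mathbf{y},\ (ac - bd)\mathbf{y} + (ad + bc)\mathbf{x}\bigr) \\
&= \bigl((ac - bd) + (ad + bc)i\bigr)(\mathbf{x}, \mathbf{y}) \\
&= \bigl((c + di)(a + bi)\bigr)(\mathbf{x}, \mathbf{y}).
\end{align*}
□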
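
The identity verified in the solution can also be tested numerically. The sketch below is not part of the original text. It assumes the sample choice $\mathcal{V} = \mathbb{R}^2$, and the function names `add`, `rscale`, and `cscale` are hypothetical. It also checks that multiplying twice by $i$ gives $-(\mathbf{x}, \mathbf{y})$, which is a special case of the same identity.

```python
def add(x, y):
    """Addition in the real vector space V = R^2 (sample choice of V)."""
    return (x[0] + y[0], x[1] + y[1])

def rscale(a, x):
    """Multiplication of a vector of V by a real number."""
    return (a * x[0], a * x[1])

def cscale(z, pair):
    """(a + bi)(x, y) = (ax - by, ay + bx) on V x V."""
    a, b = z.real, z.imag
    x, y = pair
    return (add(rscale(a, x), rscale(-b, y)),
            add(rscale(a, y), rscale(b, x)))

z1 = complex(2, 3)                      # plays the role of a + bi
z2 = complex(-1, 4)                     # plays the role of c + di
pair = ((1.0, -2.0), (5.0, 0.0))        # (x, y) with x and y in V

lhs = cscale(z2, cscale(z1, pair))      # (c + di)((a + bi)(x, y))
rhs = cscale(z2 * z1, pair)             # ((c + di)(a + bi))(x, y)
i_twice = cscale(1j, cscale(1j, pair))  # i(i(x, y))
```

All numbers involved are small integers stored as floats, so the comparison `lhs == rhs` is exact here; with general floating-point data a tolerance would be needed.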


Elements of a vector space are called vectors. The use of the word “vector”
does not imply that we are talking about the familiar vectors in $\mathbb{R}^n$
that we picture as arrows. In the above examples we considered vectors that
were functions, matrices, sequences, or even sets (Example 1.1.8). The
expressions “$\mathbf{v}$ is an element of a vector space $\mathcal{V}$” and
“$\mathbf{v}$ is a vector in a vector space $\mathcal{V}$” are completely
equivalent.

We close this section with a theorem that collects some simple but useful 
properties of addition and scaling in vector spaces. 


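Theorem 1.1.14. Let $\mathcal{V}$ be a vector space. Then

(a) The element $\mathbf{0}$ such that $\mathbf{0} + \mathbf{v} = \mathbf{v}$ for every $\mathbf{v} \in \mathcal{V}$ is unique;

(b) $0\mathbf{v} = \mathbf{0}$ for every $\mathbf{v} \in \mathcal{V}$;

(c) If $\mathbf{v} + \mathbf{u} = \mathbf{0}$, then $\mathbf{u} = (-1)\mathbf{v}$;

(d) $c\mathbf{0} = \mathbf{0}$ for every $c \in \mathbb{K}$;

(e) If $\mathbf{u} + \mathbf{w} = \mathbf{v} + \mathbf{w}$, then $\mathbf{u} = \mathbf{v}$.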


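Proof. To prove (a) assume that there are $\mathbf{0}_1, \mathbf{0}_2 \in \mathcal{V}$
such that $\mathbf{0}_1 + \mathbf{v} = \mathbf{v}$ and
$\mathbf{0}_2 + \mathbf{v} = \mathbf{v}$ for every $\mathbf{v} \in \mathcal{V}$.
Then we have
\[
\mathbf{0}_1 = \mathbf{0}_2 + \mathbf{0}_1 = \mathbf{0}_1 + \mathbf{0}_2 = \mathbf{0}_2.
\]
For every $\mathbf{v} \in \mathcal{V}$ we have
\[
0\mathbf{v} + \mathbf{v} = 0\mathbf{v} + 1\mathbf{v} = (0 + 1)\mathbf{v} = 1\mathbf{v} = \mathbf{v}.
\]
Now let $\mathbf{u} \in \mathcal{V}$ be such that $\mathbf{v} + \mathbf{u} = \mathbf{0}$.
Then $0\mathbf{v} + \mathbf{v} = \mathbf{v}$ implies
\[
\mathbf{0} = \mathbf{v} + \mathbf{u} = (0\mathbf{v} + \mathbf{v}) + \mathbf{u} = 0\mathbf{v} + (\mathbf{v} + \mathbf{u}) = 0\mathbf{v} + \mathbf{0} = 0\mathbf{v}.
\]
Thus, $0\mathbf{v} = \mathbf{0}$ for every $\mathbf{v} \in \mathcal{V}$, proving (b).

To prove (c) we first observe that for every $\mathbf{v} \in \mathcal{V}$ we have
\[
\mathbf{v} + (-1)\mathbf{v} = 1\mathbf{v} + (-1)\mathbf{v} = (1 - 1)\mathbf{v} = 0\mathbf{v} = \mathbf{0}.
\]
It remains to show that $(-1)\mathbf{v}$ is the only element with that property.
Indeed, if $\mathbf{u}$ is any element such that $\mathbf{v} + \mathbf{u} = \mathbf{0}$, then
\[
\mathbf{u} = \mathbf{u} + \mathbf{0} = \mathbf{u} + (\mathbf{v} + (-1)\mathbf{v}) = (\mathbf{u} + \mathbf{v}) + (-1)\mathbf{v} = \mathbf{0} + (-1)\mathbf{v} = (-1)\mathbf{v}.
\]
For any $c \in \mathbb{K}$ we have
\[
c\mathbf{0} = c(\mathbf{v} + (-1)\mathbf{v}) = c\mathbf{v} + (-c)\mathbf{v} = (c - c)\mathbf{v} = 0\mathbf{v} = \mathbf{0},
\]
where $\mathbf{v}$ is an arbitrary element in $\mathcal{V}$, proving (d).

Finally, if $\mathbf{u} + \mathbf{w} = \mathbf{v} + \mathbf{w}$, then
\begin{align*}
\mathbf{u} &= \mathbf{u} + \mathbf{0} = \mathbf{u} + (\mathbf{w} + (-1)\mathbf{w}) = (\mathbf{u} + \mathbf{w}) + (-1)\mathbf{w} \\
&= (\mathbf{v} + \mathbf{w}) + (-1)\mathbf{w} = \mathbf{v} + (\mathbf{w} + (-1)\mathbf{w}) = \mathbf{v} + \mathbf{0} = \mathbf{v},
\end{align*}
proving (e). □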


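The element $(-1)\mathbf{v}$ is denoted simply by $-\mathbf{v}$. With this
notation we have
\[
\mathbf{v} - \mathbf{v} = -\mathbf{v} + \mathbf{v} = \mathbf{0}.
\]
Note that $-(-\mathbf{v}) = \mathbf{v}$.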


1.2 Subspaces 


The concept of a subspace of a vector space is one of the fundamental ideas of 
linear algebra. We begin the discussion of subspaces by considering an example. 


Example 1.2.1. Let $\mathcal{V}$ be a vector space and let $\mathbf{x}$ be a
vector in $\mathcal{V}$. The set of all vectors of the form $c\mathbf{x}$ where
$c \in \mathbb{K}$ is a vector space with the addition and the multiplication
by scalars inherited from the vector space $\mathcal{V}$, that is,
\[
c_1\mathbf{x} + c_2\mathbf{x} = (c_1 + c_2)\mathbf{x} \quad\text{and}\quad c_1(c_2\mathbf{x}) = (c_1 c_2)\mathbf{x}.
\]
This vector space is denoted by $\mathbb{K}\mathbf{x}$ or $\operatorname{Span}\{\mathbf{x}\}$:
\[
\mathbb{K}\mathbf{x} = \operatorname{Span}\{\mathbf{x}\} = \{c\mathbf{x} : c \in \mathbb{K}\}.
\]
Without being too precise, one can say that the vector space
$\operatorname{Span}\{\mathbf{x}\}$ is a “small vector space in a large vector
space”. It is important that every element of the small space is an element of
the large space and that the operations of addition and scaling in the small
space are the same as in the large space. These ideas are captured in the
following precise definition.


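Definition 1.2.2. A nonempty subset $\mathcal{U}$ of a vector space
$\mathcal{V}$ is called a subspace of $\mathcal{V}$ if the following two
conditions are satisfied:

(a) If $\mathbf{v} \in \mathcal{U}$ and $c \in \mathbb{K}$, then $c\mathbf{v} \in \mathcal{U}$;

(b) If $\mathbf{v}, \mathbf{w} \in \mathcal{U}$, then $\mathbf{v} + \mathbf{w} \in \mathcal{U}$.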


The set $\operatorname{Span}\{\mathbf{x}\}$ defined in Example 1.2.1 is a
subspace of the vector space $\mathcal{V}$. Note that, if we take
$\mathbf{x} = \mathbf{0}$, then $\operatorname{Span}\{\mathbf{x}\} = \{\mathbf{0}\}$.
Since $\mathbf{0}$ is an element in every vector space, $\{\mathbf{0}\}$ is a
subspace of every vector space.

From the definition of subspaces it follows that every vector space is a
subspace of itself. To exclude these special cases, we add the word “proper”:
$\mathcal{U}$ is a proper subspace of $\mathcal{V}$ if $\mathcal{U}$ is a
subspace of $\mathcal{V}$ such that $\mathcal{U} \neq \mathcal{V}$ and
$\mathcal{U} \neq \{\mathbf{0}\}$.

Now we present several examples of subspaces that often appear in
applications.


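Example 1.2.3. A polynomial is a function $p : \mathbb{K} \to \mathbb{K}$
defined by
\[
p(t) = a_0 + a_1 t + \dots + a_n t^n
\]
where $a_0, a_1, \dots, a_n \in \mathbb{K}$. If $a_n \neq 0$, then $n$ is called
the degree of the polynomial $p(t) = a_0 + a_1 t + \dots + a_n t^n$ (see
Appendix C).

The set of all polynomials, denoted by $\mathcal{P}(\mathbb{K})$, is a subspace
of $\mathcal{F}_{\mathbb{K}}(\mathbb{K})$. The set $\mathcal{P}_n(\mathbb{K})$
of all polynomials of the form $a_0 + a_1 t + \dots + a_n t^n$ is a subspace of
$\mathcal{P}(\mathbb{K})$. If $k \leq n$, then $\mathcal{P}_k(\mathbb{K})$ is a
subspace of $\mathcal{P}_n(\mathbb{K})$.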
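
One way to experiment with these spaces is to store a polynomial as its list of coefficients. The sketch below is not part of the original text, and the names `p_add`, `p_scale`, and `evaluate` are hypothetical. It shows that the sum and the scalar multiple of polynomials are again coefficient lists, and that adding coefficient lists agrees with adding the polynomial functions pointwise, as in $\mathcal{F}_{\mathbb{K}}(\mathbb{K})$.

```python
def p_add(p, q):
    """Add polynomials given as coefficient lists [a0, a1, ..., an]."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))   # pad the shorter list with zeros
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def p_scale(c, p):
    """Multiply every coefficient by the scalar c."""
    return [c * a for a in p]

def evaluate(p, t):
    """Value a0 + a1 t + ... + an t^n, computed by Horner's rule."""
    result = 0
    for a in reversed(p):
        result = result * t + a
    return result

p = [1, 0, -2]       # 1 - 2t^2, a polynomial in P_2
q = [0, 3, 0, 5]     # 3t + 5t^3, a polynomial in P_3
r = p_add(p_scale(4, p), q)   # the polynomial 4p + q

# The coefficientwise operations agree with the pointwise operations.
agrees = all(evaluate(r, t) == 4 * evaluate(p, t) + evaluate(q, t)
             for t in range(-3, 4))
```

Padding with zeros in `p_add` corresponds to viewing a polynomial from $\mathcal{P}_k(\mathbb{K})$ as an element of $\mathcal{P}_n(\mathbb{K})$ when $k \leq n$.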


Example 1.2.4. For every integer $n \geq 1$ we denote by
$\mathcal{D}^n_{\mathbb{R}}(\mathbb{R})$ the set of all real-valued $n$-times
differentiable functions defined on $\mathbb{R}$.

We also denote by $\mathcal{D}_{\mathbb{R}}(\mathbb{R})$ the set of all
real-valued functions defined on $\mathbb{R}$ which are $n$-times
differentiable for every integer $n \geq 1$.

Show that $\mathcal{D}^n_{\mathbb{R}}(\mathbb{R})$ is a subspace of
$\mathcal{F}_{\mathbb{R}}(\mathbb{R})$ for every integer $n \geq 1$ and that
$\mathcal{D}_{\mathbb{R}}(\mathbb{R})$ is a subspace of
$\mathcal{D}^n_{\mathbb{R}}(\mathbb{R})$ for every integer $n \geq 1$.


Solution. This is an immediate consequence of the fact that the sum of
differentiable functions is differentiable and that a constant multiple of a
differentiable function is differentiable. □


Example 1.2.5. Show that the set $\mathcal{S}_{n \times n}(\mathbb{K})$ of all
$n \times n$ symmetric matrices (matrices such that $A^T = A$) is a subspace of
the vector space of the square matrices $\mathcal{M}_{n \times n}(\mathbb{K})$.


Solution. If $A$ and $B$ are symmetric matrices and $c \in \mathbb{K}$, then we
have
\[
(A + B)^T = A^T + B^T = A + B \quad\text{and}\quad (cA)^T = cA^T = cA.
\]
□


Example 1.2.6. Show that the set $\mathcal{A}_{n \times n}(\mathbb{K})$ of all
antisymmetric matrices (matrices such that $A^T = -A$) is a subspace of the
vector space of square matrices $\mathcal{M}_{n \times n}(\mathbb{K})$.


Solution. If $A$ and $B$ are antisymmetric matrices and $c \in \mathbb{K}$,
then we have
\[
(A + B)^T = A^T + B^T = -A - B = -(A + B)
\]
and
\[
(cA)^T = cA^T = c(-A) = -cA.
\]
□
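
The two closure computations for symmetric and antisymmetric matrices can be spot-checked on small integer matrices. This sketch is not part of the original text; matrices are stored as lists of rows, and all function names are hypothetical helpers.

```python
def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*A)]

def m_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def m_scale(c, A):
    return [[c * a for a in row] for row in A]

def is_symmetric(A):
    return transpose(A) == A

def is_antisymmetric(A):
    return transpose(A) == m_scale(-1, A)

S1 = [[1, 2], [2, 5]]      # symmetric
S2 = [[0, -7], [-7, 3]]    # symmetric
K1 = [[0, 4], [-4, 0]]     # antisymmetric
K2 = [[0, -1], [1, 0]]     # antisymmetric

sym_closed = is_symmetric(m_add(S1, S2)) and is_symmetric(m_scale(6, S1))
antisym_closed = (is_antisymmetric(m_add(K1, K2))
                  and is_antisymmetric(m_scale(6, K1)))

# A symmetric plus an antisymmetric matrix is in general neither.
mixed = m_add(S1, K1)
mixed_is_neither = not is_symmetric(mixed) and not is_antisymmetric(mixed)
```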
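
Example 1.2.7. Let $A \in \mathcal{M}_{m \times n}(\mathbb{K})$. Show that the
set
\[
\mathbf{C}(A) = \{A\mathbf{x} : \mathbf{x} \in \mathbb{K}^n\}
\]
is a subspace of the vector space $\mathbb{K}^m$.

Solution. If $\mathbf{y}_1 = A\mathbf{x}_1$, $\mathbf{y}_2 = A\mathbf{x}_2$,
$\mathbf{y} = A\mathbf{x}$, and $c \in \mathbb{K}$, then
\[
\mathbf{y}_1 + \mathbf{y}_2 = A(\mathbf{x}_1 + \mathbf{x}_2) \quad\text{and}\quad c\mathbf{y} = cA\mathbf{x} = A(c\mathbf{x}).
\]
□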

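Example 1.2.8. Let $A \in \mathcal{M}_{m \times n}(\mathbb{K})$. Show that the
set
\[
\mathbf{N}(A) = \{\mathbf{x} \in \mathbb{K}^n : A\mathbf{x} = \mathbf{0}\}
\]
is a subspace of the vector space $\mathbb{K}^n$. Determine
\[
\mathbf{N}\left(\begin{bmatrix} 2+i & 1-i & 1 \\ i & 2+i & i \\ 2+2i & 3 & 1+i \end{bmatrix}\right).
\]


Solution. If $A\mathbf{x}_1 = A\mathbf{x}_2 = A\mathbf{x} = \mathbf{0}$ and
$c \in \mathbb{K}$, then
\[
A(\mathbf{x}_1 + \mathbf{x}_2) = A\mathbf{x}_1 + A\mathbf{x}_2 = \mathbf{0} \quad\text{and}\quad A(c\mathbf{x}) = c(A\mathbf{x}) = \mathbf{0},
\]
which proves that $\mathbf{N}(A)$ is a subspace of $\mathbb{K}^n$.

Since the reduced row echelon form of the matrix
\[
\begin{bmatrix} 2+i & 1-i & 1 \\ i & 2+i & i \\ 2+2i & 3 & 1+i \end{bmatrix}
\]
is
\[
\begin{bmatrix} 1 & 0 & \frac{2}{13} - \frac{3}{13}i \\ 0 & 1 & \frac{1}{13} + \frac{5}{13}i \\ 0 & 0 & 0 \end{bmatrix},
\]
the solution of the equation
\[
\begin{bmatrix} 2+i & 1-i & 1 \\ i & 2+i & i \\ 2+2i & 3 & 1+i \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]
is
\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = c \begin{bmatrix} -\frac{2}{13} + \frac{3}{13}i \\ -\frac{1}{13} - \frac{5}{13}i \\ 1 \end{bmatrix},
\]
where $c$ is an arbitrary complex number. This means that
\[
\mathbf{N}\left(\begin{bmatrix} 2+i & 1-i & 1 \\ i & 2+i & i \\ 2+2i & 3 & 1+i \end{bmatrix}\right) = \operatorname{Span}\left\{\begin{bmatrix} -\frac{2}{13} + \frac{3}{13}i \\ -\frac{1}{13} - \frac{5}{13}i \\ 1 \end{bmatrix}\right\}.
\]
□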
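
A computed null space is easy to double-check by multiplying the matrix by the proposed spanning vector. The sketch below is not part of the original text. It assumes that the matrix of this example has rows $(2+i,\ 1-i,\ 1)$, $(i,\ 2+i,\ i)$, $(2+2i,\ 3,\ 1+i)$, and it tests the vector $(-2+3i,\ -1-5i,\ 13)$, which is 13 times the vector $\left(-\frac{2}{13}+\frac{3}{13}i,\ -\frac{1}{13}-\frac{5}{13}i,\ 1\right)$, so that all arithmetic stays in small integers. The helper name `mat_vec` is hypothetical.

```python
A = [
    [2 + 1j, 1 - 1j, 1],
    [1j, 2 + 1j, 1j],
    [2 + 2j, 3, 1 + 1j],
]

# 13 times the spanning vector, so every entry has integer parts.
v = [-2 + 3j, -1 - 5j, 13]

def mat_vec(M, x):
    """Product of a matrix (list of rows) and a vector (list of entries)."""
    return [sum(m * t for m, t in zip(row, x)) for row in M]

Av = mat_vec(A, v)
in_null_space = all(entry == 0 for entry in Av)

# Every scalar multiple of v is again a solution.
c = 3 - 2j
multiple_in_null_space = all(entry == 0
                             for entry in mat_vec(A, [c * t for t in v]))

# The third row is the sum of the first two, so the rank is less than 3.
row3_is_sum = all(A[2][k] == A[0][k] + A[1][k] for k in range(3))
```

Python's built-in complex numbers are floating-point numbers, so exact comparison with zero is reliable here only because every intermediate value is a small integer.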


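Example 1.2.9. Let $a$, $b$, and $c$ be distinct numbers from $\mathbb{K}$ and
let $\mathcal{U}$ be the set of polynomials from $\mathcal{P}_n(\mathbb{K})$
such that $p(a) = p(b) = p(c)$, that is,
\[
\mathcal{U} = \{p \in \mathcal{P}_n(\mathbb{K}) : p(a) = p(b) = p(c)\}.
\]
Show that $\mathcal{U}$ is a subspace of $\mathcal{P}_n(\mathbb{K})$.


Solution. If $p_1, p_2 \in \mathcal{U}$, then
\[
p_1(a) = p_1(b) = p_1(c) \quad\text{and}\quad p_2(a) = p_2(b) = p_2(c),
\]
and consequently
\[
p_1(a) + p_2(a) = p_1(b) + p_2(b) = p_1(c) + p_2(c),
\]
which means that $p_1 + p_2 \in \mathcal{U}$. Similarly, if $p \in \mathcal{U}$
and $k \in \mathbb{K}$, then
\[
p(a) = p(b) = p(c)
\]
and consequently
\[
k p(a) = k p(b) = k p(c),
\]
which means that $kp \in \mathcal{U}$.

Note that in our argument we are not using the fact that elements of
$\mathcal{U}$ are polynomials. A similar argument can be used to show that, if
$\mathcal{V}$ is any subspace of $\mathcal{F}_{\mathbb{K}}(S)$ and
$s_1, s_2, \dots, s_n \in S$, then
\[
\mathcal{U} = \{f \in \mathcal{V} : f(s_1) = f(s_2) = \dots = f(s_n)\}
\]
is a subspace of $\mathcal{V}$. □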

Now we are going to consider some general properties of subspaces. We
begin with the following simple observation that will help us show that every
subspace of a vector space is a vector space itself.


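Lemma 1.2.10. If $\mathcal{U}$ is a subspace of a vector space $\mathcal{V}$,
then

(a) If $\mathbf{v} \in \mathcal{U}$, then $-\mathbf{v} \in \mathcal{U}$;

(b) $\mathbf{0} \in \mathcal{U}$.


Proof. To show that (a) holds we note that for any $\mathbf{v} \in \mathcal{V}$
we have
\[
-\mathbf{v} = (-1)\mathbf{v},
\]
so if $\mathbf{v} \in \mathcal{U}$, then $-\mathbf{v} = (-1)\mathbf{v} \in \mathcal{U}$.

To show that (b) holds we take an arbitrary $\mathbf{u} \in \mathcal{U}$ and
note that
\[
\mathbf{0} = \mathbf{u} - \mathbf{u} = \mathbf{u} + (-1)\mathbf{u} \in \mathcal{U}.
\]
□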


Theorem 1.2.11. A subspace $\mathcal{U}$ of a vector space $\mathcal{V}$ is a
vector space itself.


Proof. First we note that, by the definition of a subspace, for every
$\mathbf{v}, \mathbf{w} \in \mathcal{U}$ the element $\mathbf{v} + \mathbf{w}$
is in $\mathcal{U}$ and for every $\mathbf{v} \in \mathcal{U}$ and every number
$c \in \mathbb{K}$ the element $c\mathbf{v}$ is in $\mathcal{U}$. Moreover, by
Lemma 1.2.10, $\mathbf{0} \in \mathcal{U}$ and $-\mathbf{v} \in \mathcal{U}$ for
every $\mathbf{v} \in \mathcal{U}$. Finally, the eight conditions in the
definition of a vector space are satisfied for elements of $\mathcal{U}$
because they are satisfied for all elements of $\mathcal{V}$. □


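Definition 1.2.12. Let $\mathcal{V}_1, \dots, \mathcal{V}_n$ be subspaces of a
vector space $\mathcal{V}$. The set of all possible sums of the form
\[
\mathbf{v}_1 + \dots + \mathbf{v}_n,
\]
where $\mathbf{v}_1 \in \mathcal{V}_1, \dots, \mathbf{v}_n \in \mathcal{V}_n$,
is denoted by
\[
\mathcal{V}_1 + \dots + \mathcal{V}_n,
\]
that is,
\[
\mathcal{V}_1 + \dots + \mathcal{V}_n = \{\mathbf{v}_1 + \dots + \mathbf{v}_n : \mathbf{v}_1 \in \mathcal{V}_1, \dots, \mathbf{v}_n \in \mathcal{V}_n\}.
\]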


This operation of “addition of subspaces” has properties similar to ordinary 
addition. 


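Theorem 1.2.13. Let $\mathcal{U}$, $\mathcal{V}$, and $\mathcal{W}$ be subspaces
of a vector space $\mathcal{X}$. Then
\[
\mathcal{U} + \mathcal{V} = \mathcal{V} + \mathcal{U}
\]
and
\[
(\mathcal{U} + \mathcal{V}) + \mathcal{W} = \mathcal{U} + (\mathcal{V} + \mathcal{W}).
\]


Proof. The properties follow immediately from the fact that
\[
\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u} \quad\text{and}\quad \mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}
\]
for any $\mathbf{u}, \mathbf{v}, \mathbf{w} \in \mathcal{X}$. □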


While there are some similarities, there are also some differences. For
example, in general $\mathcal{U} + \mathcal{W} = \mathcal{V} + \mathcal{W}$
does not imply $\mathcal{U} = \mathcal{V}$.
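
A small illustration of the failure of cancellation, added here as a sketch and not taken from the original text, uses three lines through the origin in $\mathbb{R}^2$, with vectors written as pairs. The particular subspaces are chosen only for this illustration.

```latex
\[
\mathcal{U} = \operatorname{Span}\{(1,0)\}, \qquad
\mathcal{V} = \operatorname{Span}\{(1,1)\}, \qquad
\mathcal{W} = \operatorname{Span}\{(0,1)\}.
\]
For every $(x,y) \in \mathbb{R}^2$ we have
\[
(x,y) = x(1,0) + y(0,1) \in \mathcal{U} + \mathcal{W}
\quad\text{and}\quad
(x,y) = x(1,1) + (y-x)(0,1) \in \mathcal{V} + \mathcal{W},
\]
so $\mathcal{U} + \mathcal{W} = \mathbb{R}^2 = \mathcal{V} + \mathcal{W}$,
although $\mathcal{U} \neq \mathcal{V}$ because $(1,1) \notin \mathcal{U}$.
```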


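Theorem 1.2.14. If $\mathcal{V}_1, \dots, \mathcal{V}_n$ are subspaces of a
vector space $\mathcal{V}$, then the set $\mathcal{V}_1 + \dots + \mathcal{V}_n$
is a subspace of $\mathcal{V}$.


Proof. If $\mathbf{v} \in \mathcal{V}_1 + \dots + \mathcal{V}_n$, then
\[
\mathbf{v} = \mathbf{v}_1 + \dots + \mathbf{v}_n,
\]
for some $\mathbf{v}_j \in \mathcal{V}_j$. For any number $c \in \mathbb{K}$ we
have
\[
c\mathbf{v} = c(\mathbf{v}_1 + \dots + \mathbf{v}_n) = c\mathbf{v}_1 + \dots + c\mathbf{v}_n,
\]
which shows that $c\mathbf{v} \in \mathcal{V}_1 + \dots + \mathcal{V}_n$.

If $\mathbf{v}, \mathbf{w} \in \mathcal{V}_1 + \dots + \mathcal{V}_n$, then
\[
\mathbf{v} = \mathbf{v}_1 + \dots + \mathbf{v}_n \quad\text{and}\quad \mathbf{w} = \mathbf{w}_1 + \dots + \mathbf{w}_n,
\]
for some vectors $\mathbf{v}_j, \mathbf{w}_j \in \mathcal{V}_j$. Since
\[
\mathbf{v} + \mathbf{w} = \mathbf{v}_1 + \dots + \mathbf{v}_n + \mathbf{w}_1 + \dots + \mathbf{w}_n = (\mathbf{v}_1 + \mathbf{w}_1) + \dots + (\mathbf{v}_n + \mathbf{w}_n),
\]
we have $\mathbf{v} + \mathbf{w} \in \mathcal{V}_1 + \dots + \mathcal{V}_n$. □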


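Definition 1.2.15. Let $\mathbf{v}_1, \dots, \mathbf{v}_n$ be elements of a
vector space $\mathcal{V}$. Any vector of $\mathcal{V}$ of the form
\[
x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n,
\]
where $x_1, \dots, x_n \in \mathbb{K}$, is called a linear combination of
$\mathbf{v}_1, \dots, \mathbf{v}_n$. The set of all linear combinations of the
vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ is denoted by
$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$, that is,
\[
\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\} = \{x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n : x_1, \dots, x_n \in \mathbb{K}\}.
\]
The set $\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is called the
linear span (or simply span) of vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$.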


Linear spans play an important role in defining subspaces in vector spaces. 


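Theorem 1.2.16. If $\mathbf{v}_1, \dots, \mathbf{v}_n$ are elements of a vector space $\mathcal{V}$, then the set $\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a subspace of $\mathcal{V}$.

Proof. Since
$$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\} = \mathbb{K}\mathbf{v}_1 + \dots + \mathbb{K}\mathbf{v}_n,$$
the result is a consequence of Theorem 1.2.14. □

If $\mathcal{U} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$, we say that $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a spanning set for $\mathcal{U}$. Note that $\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is the smallest vector space containing the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$.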


1.2. SUBSPACES 15 


Example 1.2.17. $\mathcal{P}_n(\mathbb{K}) = \operatorname{Span}\{1, t, t^2, \dots, t^n\}$.


Example 1.2.18. The set of all real-valued solutions of the differential equation
$$y' - 3y = 0$$
is a subspace of $\mathcal{D}_{\mathbb{R}}(\mathbb{R})$. This subspace is $\operatorname{Span}\{e^{3t}\}$.


Example 1.2.19. The set of all real-valued solutions of the differential equation
$$y'' + y = 0$$
is a subspace of the vector space $\mathcal{D}_{\mathbb{R}}(\mathbb{R})$. This subspace is $\operatorname{Span}\{\cos t, \sin t\}$.


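The choice of a spanning set is not unique. For example, we have
$$\operatorname{Span}\{\cos t, \sin t\} = \operatorname{Span}\{\cos t + \sin t, \cos t - \sin t\}.$$
Indeed, if $f \in \operatorname{Span}\{\cos t, \sin t\}$, then for some $a, b \in \mathbb{K}$ we have
$$f(t) = a\cos t + b\sin t = \frac{a+b}{2}(\cos t + \sin t) + \frac{a-b}{2}(\cos t - \sin t),$$
so $\operatorname{Span}\{\cos t, \sin t\} \subseteq \operatorname{Span}\{\cos t + \sin t, \cos t - \sin t\}$.
Similarly, if $g \in \operatorname{Span}\{\cos t + \sin t, \cos t - \sin t\}$, then for some $c, d \in \mathbb{K}$ we have
$$g(t) = c(\cos t + \sin t) + d(\cos t - \sin t) = (c + d)\cos t + (c - d)\sin t,$$
so $\operatorname{Span}\{\cos t + \sin t, \cos t - \sin t\} \subseteq \operatorname{Span}\{\cos t, \sin t\}$.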
In the remainder of this section we investigate what changes to the spanning 
set leave the spanned subspace unchanged. 


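Theorem 1.2.20. Let $\mathbf{v}_1, \dots, \mathbf{v}_n$ and $\mathbf{v}$ be elements of a vector space $\mathcal{V}$. If $\mathbf{v} \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$, then
$$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n, \mathbf{v}\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}.$$

Proof. Clearly, $\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\} \subseteq \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n, \mathbf{v}\}$.
Now consider a $\mathbf{w} \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n, \mathbf{v}\}$. Then there are numbers $a_1, \dots, a_n, a_{n+1} \in \mathbb{K}$ such that
$$\mathbf{w} = a_1\mathbf{v}_1 + \dots + a_n\mathbf{v}_n + a_{n+1}\mathbf{v}.$$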




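Since $\mathbf{v} \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$, there are numbers $b_1, \dots, b_n \in \mathbb{K}$ such that
$$\mathbf{v} = b_1\mathbf{v}_1 + \dots + b_n\mathbf{v}_n.$$
Then
$$\begin{aligned}
\mathbf{w} &= a_1\mathbf{v}_1 + \dots + a_n\mathbf{v}_n + a_{n+1}\mathbf{v} \\
&= a_1\mathbf{v}_1 + \dots + a_n\mathbf{v}_n + a_{n+1}(b_1\mathbf{v}_1 + \dots + b_n\mathbf{v}_n) \\
&= (a_1 + a_{n+1}b_1)\mathbf{v}_1 + \dots + (a_n + a_{n+1}b_n)\mathbf{v}_n.
\end{aligned}$$
Thus $\mathbf{w} \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$. □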


The next theorem gives us a condition that can be used to check whether two
sets of vectors span the same subspace.


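Theorem 1.2.21. Let $\mathbf{u}_1, \dots, \mathbf{u}_k$ and $\mathbf{v}_1, \dots, \mathbf{v}_m$ be elements of a vector space $\mathcal{V}$. Then
$$\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$$
if and only if
$$\mathbf{u}_1, \dots, \mathbf{u}_k \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_m\} \quad\text{and}\quad \mathbf{v}_1, \dots, \mathbf{v}_m \in \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}.$$

Proof. If $\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$, then clearly we must have $\mathbf{u}_1, \dots, \mathbf{u}_k \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\mathbf{v}_1, \dots, \mathbf{v}_m \in \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$.

On the other hand, if $\mathbf{u}_1, \dots, \mathbf{u}_k \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\mathbf{v}_1, \dots, \mathbf{v}_m \in \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$, then
$$\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k, \mathbf{v}_1, \dots, \mathbf{v}_m\} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$$
and
$$\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k, \mathbf{v}_1, \dots, \mathbf{v}_m\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_m\},$$
by Theorem 1.2.20. Hence $\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$. □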


In the next three corollaries we show that “elementary operations” on the
vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ do not affect the vector space $\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$.


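Corollary 1.2.22. For any $1 \leq i < j \leq n$ we have
$$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_j, \dots, \mathbf{v}_i, \dots, \mathbf{v}_n\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_i, \dots, \mathbf{v}_j, \dots, \mathbf{v}_n\}.$$

Proof. This is a direct consequence of Theorem 1.2.21. □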


1.3. LINEARLY INDEPENDENT VECTORS AND BASES 17 


While the above corollary says that interchanging the position of two vectors in $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ does not affect the span, it follows that writing these vectors in any order will have no effect on the span. More precisely,
$$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\} = \operatorname{Span}\{\mathbf{v}_{\sigma(1)}, \dots, \mathbf{v}_{\sigma(n)}\},$$
where $\sigma$ is any permutation of $\{1, \dots, n\}$.


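Corollary 1.2.23. For any $j \in \{1, \dots, n\}$ and any scalar $c \neq 0$ we have
$$\operatorname{Span}\{\mathbf{v}_1, \dots, c\mathbf{v}_j, \dots, \mathbf{v}_n\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_j, \dots, \mathbf{v}_n\}.$$

Proof. This is a direct consequence of Theorem 1.2.21 and the equality $\mathbf{v}_j = \frac{1}{c}(c\mathbf{v}_j)$. □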


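Corollary 1.2.24. For any $i, j \in \{1, \dots, n\}$ with $i \neq j$ and any scalar $c$ we have
$$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_i + c\mathbf{v}_j, \dots, \mathbf{v}_n\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_i, \dots, \mathbf{v}_n\}.$$

Proof. This is a direct consequence of Theorem 1.2.21 and the equality $\mathbf{v}_i = (\mathbf{v}_i + c\mathbf{v}_j) - c\mathbf{v}_j$. □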
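The three corollaries can be checked numerically on a small example. The following is a minimal sketch, assuming vectors in $\mathbb{Q}^3$ given as tuples; the helpers `rank`, `same_span`, `scale`, and `add` are hypothetical names, and the rank-based comparison of spans is a tool from an introductory course rather than something proved in this section.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def same_span(a, b):
    """True when the two lists of vectors span the same subspace."""
    return rank(a) == rank(b) == rank(a + b)


def scale(v, c):
    return tuple(c * x for x in v)


def add(v, w):
    return tuple(x + y for x, y in zip(v, w))


vectors = [(1, 2, 0), (0, 1, 1), (1, 0, 0)]
v1, v2, v3 = vectors

swapped = [v3, v2, v1]                      # interchange (Corollary 1.2.22)
scaled = [v1, scale(v2, 5), v3]             # scaling by c = 5 (Corollary 1.2.23)
sheared = [add(v1, scale(v3, -2)), v2, v3]  # v1 - 2 v3 (Corollary 1.2.24)
zeroed = [v1, scale(v2, 0), v3]             # c = 0 is not allowed in 1.2.23

results = [same_span(vectors, s) for s in (swapped, scaled, sheared, zeroed)]
print(results)
```

The last case shows why the assumption $c \neq 0$ matters: replacing a vector by the zero vector can shrink the span.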


The operations on the spanning set that do not affect the subspace, described 
in this section, either do not change or increase the number of spanning vectors. 
Clearly, if we can describe a subspace using fewer vectors it would make sense 
to use that smaller spanning set. How can we tell whether it is possible to find 
a smaller spanning set? This question will be addressed in the next section. 


1.3. Linearly independent vectors and bases 


In this section we introduce two concepts that are of basic importance in every
aspect of linear algebra. In the first definition we describe a property that will
provide an answer to the question asked at the end of the last section, that is, a
property of a set of vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ that implies that there is a smaller set of
vectors spanning the same vector space.




Definition 1.3.1. Vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ in a vector space $\mathcal{V}$ are called linearly dependent if the equation
$$x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = \mathbf{0}$$
has a nontrivial solution, that is, a solution such that at least one of the numbers $x_1, \dots, x_n$ is different from 0.


i i 
Example 1.3.2. In the complex vector space C? the vectors |—i|,} i] and 
Oe 
0 
1], are linearly dependent because
1 
j a i a 0 0 
—~|—-7] —~=] i] 4+ ]1] =]0 
No See) ale No 


Linear dependence of vectors does not depend on the vector space. More precisely, if vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly dependent in the space $\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$, the smallest vector space containing the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$, then they are linearly dependent in every vector space that contains $\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ as a subspace. On the other hand, linear dependence can depend on the scalar field $\mathbb{K}$. For example, the vectors $1$ and $i$ are linearly dependent in the complex vector space $\mathbb{C}$, because $i \cdot 1 - i = 0$, but they are not linearly dependent in the real vector space $\mathbb{C}$.


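Theorem 1.3.3. If one of the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ in a vector space $\mathcal{V}$ is $\mathbf{0}$, then the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly dependent.

Proof. If $\mathbf{v}_k = \mathbf{0}$ for some $k \in \{1, \dots, n\}$, then we can take $x_k = 1$ and $x_m = 0$ for $m \neq k$. With this choice we have
$$x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = \mathbf{0},$$
which shows that the vectors are linearly dependent. □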


The next theorem gives a practical criterion for linear dependence of a set of
vectors.




Theorem 1.3.4. A set of $n$ vectors, with $n \geq 2$, is linearly dependent if and only if at least one of the vectors can be expressed as a linear combination of the remaining vectors. In other words, vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$, with $n \geq 2$, are linearly dependent if and only if there is a $k \in \{1, \dots, n\}$ such that
$$\mathbf{v}_k \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_{k-1}, \mathbf{v}_{k+1}, \dots, \mathbf{v}_n\}.$$


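Proof. If the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly dependent, then
$$x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = \mathbf{0},$$
where $x_1, \dots, x_n \in \mathbb{K}$ and $x_k \neq 0$ for some $k \in \{1, \dots, n\}$. Then
$$\frac{x_1}{x_k}\mathbf{v}_1 + \dots + \frac{x_{k-1}}{x_k}\mathbf{v}_{k-1} + \mathbf{v}_k + \frac{x_{k+1}}{x_k}\mathbf{v}_{k+1} + \dots + \frac{x_n}{x_k}\mathbf{v}_n = \mathbf{0}$$
and thus
$$\mathbf{v}_k = -\frac{x_1}{x_k}\mathbf{v}_1 - \dots - \frac{x_{k-1}}{x_k}\mathbf{v}_{k-1} - \frac{x_{k+1}}{x_k}\mathbf{v}_{k+1} - \dots - \frac{x_n}{x_k}\mathbf{v}_n.$$
But this means that $\mathbf{v}_k \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_{k-1}, \mathbf{v}_{k+1}, \dots, \mathbf{v}_n\}$.
Conversely, if $\mathbf{v}_k \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_{k-1}, \mathbf{v}_{k+1}, \dots, \mathbf{v}_n\}$ for some $k \in \{1, \dots, n\}$, then there are $a_1, \dots, a_{k-1}, a_{k+1}, \dots, a_n \in \mathbb{K}$ such that
$$\mathbf{v}_k = a_1\mathbf{v}_1 + \dots + a_{k-1}\mathbf{v}_{k-1} + a_{k+1}\mathbf{v}_{k+1} + \dots + a_n\mathbf{v}_n,$$
and thus
$$x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = \mathbf{0},$$
where $x_m = a_m$ for $m \in \{1, \dots, k-1, k+1, \dots, n\}$ and $x_k = -1$, which means that the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly dependent. □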


Definition 1.3.5. Vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ in a vector space $\mathcal{V}$ are called linearly independent if the only solution of the equation
$$x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = \mathbf{0}$$
is the trivial solution, that is, $x_1 = \dots = x_n = 0$.

In other words, vectors are linearly independent if they are not linearly dependent.




Example 1.3.6. Show that the polynomials $1, t, t^2, \dots, t^n$ are linearly independent in $\mathcal{P}_n(\mathbb{K})$.

Solution. Assume
$$x_0 + x_1 t + \dots + x_n t^n = 0$$
holds for some numbers $x_0, \dots, x_n \in \mathbb{K}$. It follows from Appendix C, Theorem 5.15, that
$$x_0 = x_1 = \dots = x_n = 0.$$
Consequently the polynomials $1, t, t^2, \dots, t^n$ are linearly independent. □


The result from the next example gives a characterization of linear indepen- 
dence of the columns of a matrix and is usually proved for matrices with real 
entries in an introductory Matrix Linear Algebra course. This result will be 
used later in this chapter. 


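Example 1.3.7. Let $A \in \mathcal{M}_{n \times n}(\mathbb{K})$ with columns $\mathbf{a}_1, \dots, \mathbf{a}_n \in \mathbb{K}^n$, that is, $A = \begin{bmatrix} \mathbf{a}_1 & \dots & \mathbf{a}_n \end{bmatrix}$. Show that the following conditions are equivalent:

(a) The vectors $\mathbf{a}_1, \dots, \mathbf{a}_n$ are linearly independent;

(b) The matrix $A$ is invertible;

(c) The equation $A\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \mathbf{0}$ has only the trivial solution, that is, the solution $x_1 = \dots = x_n = 0$.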


Solution. The equivalence of (a) and (c) follows easily from the definition of linearly independent vectors. It is also easy to show that (b) implies (c). We only show that (c) implies (b).

Let $E_{jk}$ be the matrix with the $(j,k)$ entry being 1 and all other entries being 0 and let $\alpha \in \mathbb{K}$. For $j \neq k$, we denote
$$R_{jk}(\alpha) = I_n + \alpha E_{jk} \quad\text{and}\quad S_j(\alpha) = I_n + (\alpha - 1)E_{jj},$$
where $I_n$ is the identity $n \times n$ matrix. Note that these matrices have the following properties:

(i) $R_{jk}(\alpha)A$ is the matrix obtained from $A$ by adding the $k$th row multiplied by $\alpha$ to the $j$th row;
(ii) $S_j(\alpha)A$ is the matrix obtained from $A$ by multiplying the $j$th row by $\alpha$;
(iii) $R_{jk}(\alpha)$ is invertible and $R_{jk}(\alpha)^{-1} = R_{jk}(-\alpha)$;
(iv) If $\alpha \neq 0$, then $S_j(\alpha)$ is invertible and $S_j(\alpha)^{-1} = S_j(\alpha^{-1})$.


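Now we assume that the equation $A\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \mathbf{0}$ has only the trivial solution. Suppose that
$$A = \begin{bmatrix}
1 & 0 & \cdots & 0 & a_{1p} & \cdots & a_{1n} \\
0 & 1 & \cdots & 0 & a_{2p} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & a_{p-1,p} & \cdots & a_{p-1,n} \\
0 & 0 & \cdots & 0 & a_{pp} & \cdots & a_{pn} \\
0 & 0 & \cdots & 0 & a_{p+1,p} & \cdots & a_{p+1,n} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & a_{np} & \cdots & a_{nn}
\end{bmatrix},$$
where $p \in \{1, \dots, n\}$.
If $a_{pp} = a_{p+1,p} = \dots = a_{np} = 0$, then
$$\begin{bmatrix} -a_{1p} \\ \vdots \\ -a_{p-1,p} \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
is a nontrivial solution, which contradicts our assumption.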
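Suppose that $a_{pp} \neq 0$. We multiply $A$ by $S_p(a_{pp}^{-1})$, to get 1 as the $(p,p)$ entry, and then by $R_{rp}(-a_{rp})$ for every $r \neq p$. The result is a matrix of the form
$$A' = \begin{bmatrix}
1 & 0 & \cdots & 0 & a'_{1,p+1} & \cdots & a'_{1n} \\
0 & 1 & \cdots & 0 & a'_{2,p+1} & \cdots & a'_{2n} \\
\vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 1 & a'_{p,p+1} & \cdots & a'_{pn} \\
0 & 0 & \cdots & 0 & a'_{p+1,p+1} & \cdots & a'_{p+1,n} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & \cdots & 0 & a'_{n,p+1} & \cdots & a'_{nn}
\end{bmatrix}.$$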

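Note that, since every matrix of the form $S_j(\alpha)$, with $\alpha \neq 0$, or $R_{jk}(\alpha)$ is invertible, a solution of the equation $A'\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \mathbf{0}$ is also a solution of the equation $A\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \mathbf{0}$ and consequently the equation $A'\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \mathbf{0}$ has only the trivial solution.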

If $a_{pp} = 0$ but $a_{qp} \neq 0$ for some $q > p$, we first multiply $A$ by $R_{pq}(1)$ to get $a_{qp}$ as the $(p,p)$ entry, and then we continue as in the case $a_{pp} \neq 0$. Using induction on $p$ we can show that there are matrices $E_1, \dots, E_m$ such that
$$E_1 \cdots E_m A = I_n,$$
where each matrix $E_k$ is of the form $S_j(\alpha)$, with $\alpha \neq 0$, or $R_{jk}(\alpha)$. This gives us
$$A = E_m^{-1} \cdots E_1^{-1},$$
proving the result, because a product of invertible matrices is invertible. □
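The reduction used in this solution can be turned into a short procedure. The sketch below is one possible reading of the argument, not the authors' code: it works over the rationals with `fractions.Fraction`, uses 0-based indices, and the function names `R`, `S`, `matmul`, `identity`, and `invert` are chosen here for illustration. It multiplies $A$ on the left only by matrices of the form $R_{jk}(\alpha)$ and $S_j(\alpha)$ and accumulates their product, which is $A^{-1}$ when the process succeeds.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def R(n, j, k, alpha):
    """R_jk(alpha) = I_n + alpha * E_jk, for j != k (0-based indices)."""
    m = identity(n)
    m[j][k] += alpha
    return m


def S(n, j, alpha):
    """S_j(alpha) = I_n + (alpha - 1) * E_jj (0-based index)."""
    m = identity(n)
    m[j][j] = Fraction(alpha)
    return m


def invert(a):
    """Return the product of elementary matrices that reduces a to I_n."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    product = identity(n)  # invariant: product * (original a) == current a
    for p in range(n):
        if a[p][p] == 0:
            q = next((r for r in range(p + 1, n) if a[r][p] != 0), None)
            if q is None:
                # This is the case in which A x = 0 has a nontrivial solution.
                raise ValueError("matrix is not invertible")
            e = R(n, p, q, 1)
            a, product = matmul(e, a), matmul(e, product)
        e = S(n, p, 1 / a[p][p])
        a, product = matmul(e, a), matmul(e, product)
        for r in range(n):
            if r != p and a[r][p] != 0:
                e = R(n, r, p, -a[r][p])
                a, product = matmul(e, a), matmul(e, product)
    return product


A = [[0, 2], [1, 1]]
A_inv = invert(A)
print(A_inv)
```

Building full elementary matrices and multiplying them is wasteful compared with direct row operations, but it mirrors the proof step by step.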


Example 1.3.8. Show that the functions $1, \cos t, \cos 2t$ are linearly independent in $\mathcal{F}_{\mathbb{K}}(\mathbb{R})$.

Solution. Assume
$$a + b\cos t + c\cos 2t = 0$$
for some $a, b, c \in \mathbb{K}$. Substituting $t = 0, \frac{\pi}{2}, \pi$ gives us the system of equations
$$\begin{aligned}
a + b + c &= 0, \\
a - c &= 0, \\
a - b + c &= 0.
\end{aligned}$$
Since the only solution of this system is $a = b = c = 0$, the functions $1, \cos t, \cos 2t$ are linearly independent. □
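The claim that this system has only the trivial solution can be confirmed by a quick computation. The sketch below assumes the determinant criterion from an introductory course (a square system $M\mathbf{x} = \mathbf{0}$ has only the trivial solution when $\det M \neq 0$); the helper `det3` is a hypothetical name.

```python
import math


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


samples = [0.0, math.pi / 2, math.pi]

# Each row holds the values of 1, cos t, cos 2t at one sample point.
# The exact values are 0 or +-1, so rounding removes floating-point noise.
system = [[1, round(math.cos(t)), round(math.cos(2 * t))] for t in samples]
determinant = det3(system)
print(system, determinant)
```

A nonzero determinant means the three sample points already force $a = b = c = 0$.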


From Theorem 1.3.3 we get the following obvious corollary. 


Corollary 1.3.9. Linearly independent vectors are nonzero. 


Another easy but useful fact about linearly independent vectors is that if we 
remove some vectors from a linearly independent set of vectors, then the new 
smaller set is still linearly independent. 


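Theorem 1.3.10. If $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent vectors in a vector space $\mathcal{V}$, then for any distinct indices $i_1, \dots, i_m \in \{1, \dots, n\}$ the vectors $\mathbf{v}_{i_1}, \dots, \mathbf{v}_{i_m}$ are linearly independent.

Proof. Since
$$x_{i_1}\mathbf{v}_{i_1} + \dots + x_{i_m}\mathbf{v}_{i_m} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n,$$
where $x_j = 0$ for every $j$ not in the set $\{i_1, \dots, i_m\}$, linear independence of the vectors $\mathbf{v}_{i_1}, \dots, \mathbf{v}_{i_m}$ is an immediate consequence of linear independence of the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$. □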


The converse of the statement in Theorem 1.3.10 is not true, in general. More precisely, if every proper subset of $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is linearly independent, it is not necessarily true that the vectors $\mathbf{v}_1, \dots, \mathbf{v}_k$ are linearly independent. For example, the vectors $(1,0)$, $(0,1)$, $(1,1)$ are linearly dependent even though any two of them are linearly independent.


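If $\mathbf{v} \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$, then $\mathbf{v}$ can be written as a linear combination of the vectors $\mathbf{v}_1, \dots, \mathbf{v}_k$, but this representation is not necessarily unique. For example,
$$(1,-1) \in \operatorname{Span}\{(1,0), (0,1), (1,1)\}$$
and we have
$$(1,-1) = (1,0) - (0,1) \quad\text{and}\quad (1,-1) = (1,1) - 2(0,1).$$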


Uniqueness of representation of a vector in terms of the spanning set is a de- 
sirable property that is essential in many arguments in linear algebra and in 




applications. It turns out that linear independence of the spanning set guar- 
antees uniqueness of the representation. This is one of the main reasons why 
linear independence is so important in linear algebra. 


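Theorem 1.3.11. If $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent vectors in a vector space $\mathcal{V}$ and
$$c_1\mathbf{v}_1 + \dots + c_n\mathbf{v}_n = d_1\mathbf{v}_1 + \dots + d_n\mathbf{v}_n, \tag{1.1}$$
for some $c_1, \dots, c_n, d_1, \dots, d_n \in \mathbb{K}$, then
$$c_1 = d_1,\ c_2 = d_2,\ \dots,\ c_n = d_n.$$

Proof. From (1.1) we get
$$(c_1 - d_1)\mathbf{v}_1 + \dots + (c_n - d_n)\mathbf{v}_n = \mathbf{0},$$
and thus
$$c_1 - d_1 = \dots = c_n - d_n = 0,$$
by linear independence of the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$. □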


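The converse of the above theorem is also true:

If for every $\mathbf{v} \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ there are unique numbers $c_1, \dots, c_n \in \mathbb{K}$ such that $\mathbf{v} = c_1\mathbf{v}_1 + \dots + c_n\mathbf{v}_n$, then the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent.

Indeed, if $c_1\mathbf{v}_1 + \dots + c_n\mathbf{v}_n = \mathbf{0}$, then since we also have $0\mathbf{v}_1 + \dots + 0\mathbf{v}_n = \mathbf{0}$, we must have $c_1 = \dots = c_n = 0$, by the uniqueness.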


Definition 1.3.12. A collection of vectors $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ in a vector space $\mathcal{V}$ is called a basis of $\mathcal{V}$ if the following two conditions are satisfied:

(a) The vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent;

(b) $\mathcal{V} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$.

If $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a basis of a vector space $\mathcal{V}$, then every vector in $\mathcal{V}$ has a unique representation in the form $\mathbf{v} = c_1\mathbf{v}_1 + \dots + c_n\mathbf{v}_n$, by Theorem 1.3.11. Note that Theorem 1.3.4 implies that a basis in a vector space $\mathcal{V}$ is minimal in the following sense: If you remove even one vector from a basis in $\mathcal{V}$, it will no longer span $\mathcal{V}$.




Example 1.3.13. Show that the set $\{1, t, t^2, \dots, t^n\}$ is a basis in $\mathcal{P}_n(\mathbb{K})$.

Solution. This result is a consequence of Examples 1.2.17 and 1.3.6. □


Example 1.3.14. The set of all real solutions of the differential equation
$$y''' - y' = 0$$
is a real vector space. The set of functions $\{1, e^t, e^{-t}\}$ is a basis of this vector space.


Example 1.3.15. The set of matrices 


tb deb d-{o s) 


is a basis in the vector space of all matrices of the form E ( , where a, b and 


c are arbitrary numbers from K. 


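Example 1.3.16. Let $\mathbf{v}_1, \dots, \mathbf{v}_n$ be arbitrary vectors in a vector space $\mathcal{V}$. If $\mathbf{v}_1 \neq \mathbf{0}$, then we say that $\mathbf{v}_1$ is a pivot. For $k \geq 2$, $\mathbf{v}_k$ is called a pivot if $\mathbf{v}_k \notin \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_{k-1}\}$. Show that the set of all pivots in $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a basis of $\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$.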


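Solution. First we show by induction on $k$ that
$$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\} = \operatorname{Span}\{\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}\},$$
where $\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}$ are the pivots such that $1 \leq j_1 < \dots < j_m \leq k$. For $k = 1$ the result is trivial.
Assume now that
$$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\} = \operatorname{Span}\{\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}\},$$
where $\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}$ are the pivots such that $1 \leq j_1 < \dots < j_m \leq k$.
If $\mathbf{v}_{k+1}$ is a pivot, then
$$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k, \mathbf{v}_{k+1}\} = \operatorname{Span}\{\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}, \mathbf{v}_{k+1}\}.$$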




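If we let $j_{m+1} = k + 1$, then the vectors $\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}, \mathbf{v}_{j_{m+1}}$ are the pivots such that $1 \leq j_1 < \dots < j_m < j_{m+1} \leq k + 1$.

If $\mathbf{v}_{k+1}$ is not a pivot, then $\mathbf{v}_{k+1} \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\} = \operatorname{Span}\{\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}\}$. Consequently,
$$\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k, \mathbf{v}_{k+1}\} = \operatorname{Span}\{\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}\}$$
and the vectors $\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}$ are the pivots such that $1 \leq j_1 < \dots < j_m \leq k + 1$.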

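Now we show by induction on $k$ that, if $\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}$ are the pivots such that $1 \leq j_1 < \dots < j_m \leq k$, then the vectors $\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}$ are linearly independent.

The statement is trivially true for $k = 1$. Now assume that it is true for some $k \geq 1$ and that $\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}$ are the pivots such that $1 \leq j_1 < \dots < j_m \leq k + 1$. Let
$$x_1\mathbf{v}_{j_1} + \dots + x_m\mathbf{v}_{j_m} = \mathbf{0}$$
for some $x_1, \dots, x_m \in \mathbb{K}$. If $x_m \neq 0$, then $\mathbf{v}_{j_m} \in \operatorname{Span}\{\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_{m-1}}\}$, contradicting the assumption that $\mathbf{v}_{j_m}$ is a pivot. Consequently, $x_m = 0$ and, since $1 \leq j_1 < \dots < j_{m-1} \leq k$, we must have
$$x_1 = \dots = x_{m-1} = 0,$$
by our inductive assumption. This shows that the vectors $\mathbf{v}_{j_1}, \dots, \mathbf{v}_{j_m}$ are linearly independent.
We have shown that the set of the pivots in $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is a basis of $\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$, for any $k \leq n$. If we take $k = n$, we get the desired result. □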
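The definition of a pivot translates directly into a left-to-right scan. The following is a minimal sketch, assuming vectors in $\mathbb{Q}^n$ given as tuples; `rank` and `pivot_indices` are hypothetical helper names, and membership in a span is tested by checking whether the rank grows, which borrows the notion of rank from an introductory course.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c] / m[r][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def pivot_indices(vectors):
    """1-based indices k such that v_k is not in Span{v_1, ..., v_(k-1)}."""
    chosen, indices = [], []
    for k, v in enumerate(vectors, start=1):
        # The pivots found so far span the same subspace as v_1, ..., v_(k-1),
        # so v_k is a pivot exactly when adding it raises the rank.
        if rank(chosen + [v]) > len(chosen):
            chosen.append(v)
            indices.append(k)
    return indices


vectors = [(0, 0, 0), (1, 1, 0), (2, 2, 0), (0, 1, 1), (1, 2, 1)]
indices = pivot_indices(vectors)
print(indices)
```

In this example the zero vector, the multiple $(2,2,0)$, and the sum $(1,2,1) = (1,1,0) + (0,1,1)$ are skipped, so the pivots are the second and fourth vectors.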


Example 1.3.17. Let $E_{jk}$ be the $m \times n$ matrix such that the $(j,k)$ entry is 1 and all the other entries are 0. The set of matrices $\{E_{jk} : 1 \leq j \leq m,\ 1 \leq k \leq n\}$ is a basis of the vector space $\mathcal{M}_{m \times n}(\mathbb{K})$.


Example 1.3.18. The standard Gaussian elimination method can be applied to a matrix with complex entries in the same way as in the case of a matrix with real entries. For any $A \in \mathcal{M}_{m \times n}(\mathbb{K})$ the set of pivot columns is a basis of $\mathbf{C}(A)$, that is, the vector subspace of $\mathbb{K}^m$ spanned by the columns of $A$.


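Definition 1.3.19. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis of a vector space $\mathcal{V}$ and let $\mathbf{x} \in \mathcal{V}$. The unique numbers $c_1, \dots, c_n$ such that $\mathbf{x} = c_1\mathbf{v}_1 + \dots + c_n\mathbf{v}_n$ are called the coordinates of $\mathbf{x}$ in the basis $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$.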




Example 1.3.20. Let $a$ be an arbitrary number in $\mathbb{K}$. Show that
$$\{1, t - a, \dots, (t - a)^n\}$$
is a basis of $\mathcal{P}_n(\mathbb{K})$ and determine the coordinates of an arbitrary polynomial in this basis.


Solution. We first show that the polynomials $1, t - a, \dots, (t - a)^n$ are linearly independent. Suppose that we have
$$x_0 + x_1(t - a) + x_2(t - a)^2 + \dots + x_n(t - a)^n = 0,$$
where $x_0, \dots, x_n$ are some numbers from $\mathbb{K}$.
If we let $t = a$ we get $x_0 = 0$. Next we differentiate the above equality and get
$$x_1 + 2x_2(t - a) + \dots + nx_n(t - a)^{n-1} = 0.$$
Again we let $t = a$ and get $x_1 = 0$. We continue in this way to show that $x_2 = \dots = x_n = 0$. Consequently the polynomials $1, t - a, \dots, (t - a)^n$ are linearly independent.

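To finish the proof we will show that for any polynomial $p \in \mathcal{P}_n(\mathbb{K})$ we have
$$p(t) = p(a) + p'(a)(t - a) + \dots + \frac{1}{n!}p^{(n)}(a)(t - a)^n. \tag{1.2}$$
Let $p \in \mathcal{P}_n(\mathbb{K})$ be such that
$$p(t) = a_0 + a_1 t + \dots + a_n t^n,$$
for some $a_0, a_1, \dots, a_n \in \mathbb{K}$. Since we can write
$$p(t) = a_0 + a_1(t - a + a) + \dots + a_n(t - a + a)^n,$$
it is easy to verify, using the binomial expansion, that there are $b_0, b_1, \dots, b_n \in \mathbb{K}$ such that
$$p(t) = b_0 + b_1(t - a) + \dots + b_n(t - a)^n.$$
Thus $\operatorname{Span}\{1, t - a, \dots, (t - a)^n\} = \mathcal{P}_n(\mathbb{K})$.
Since $p(a) = b_0$ and, for $k = 1, 2, \dots, n$, $p^{(k)}(a) = b_k k!$, we have $b_k = \frac{1}{k!}p^{(k)}(a)$. Hence
$$p(t) = p(a) + p'(a)(t - a) + \dots + \frac{1}{n!}p^{(n)}(a)(t - a)^n,$$
so the coordinates of $p$ in this basis are $p(a), p'(a), \dots, \frac{1}{n!}p^{(n)}(a)$.
Note that in this example we use the derivative from Appendix C. □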
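Formula (1.2) gives a direct way to compute coordinates in the basis $\{1, t - a, \dots, (t - a)^n\}$. The sketch below is an illustration under the assumption that a polynomial is stored as its list of coefficients $[a_0, a_1, \dots, a_n]$ with rational entries; the function names are hypothetical. The function `expand` multiplies the result back out as an independent check.

```python
from fractions import Fraction
from math import factorial


def evaluate(coeffs, t):
    """Value of a_0 + a_1 t + ... + a_n t^n at t (Horner's scheme)."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * t + c
    return result


def derivative(coeffs):
    """Coefficient list of the derivative."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def shifted_coordinates(coeffs, a):
    """Coordinates p^(k)(a) / k! in the basis 1, t - a, ..., (t - a)^n."""
    coords, p = [], list(coeffs)
    for k in range(len(coeffs)):
        coords.append(evaluate(p, a) / factorial(k))
        p = derivative(p)
    return coords


def expand(coords, a):
    """Coefficients of sum_k coords[k] * (t - a)^k in the basis 1, t, ..., t^n."""
    result = [Fraction(0)] * len(coords)
    power = [Fraction(1)]  # coefficients of (t - a)^k, starting with k = 0
    for b in coords:
        for i, c in enumerate(power):
            result[i] += b * c
        power = [Fraction(0)] + power      # multiply by t
        for i in range(len(power) - 1):    # subtract a times the old power
            power[i] -= a * power[i + 1]
    return result


p = [1, 2, 3]                     # p(t) = 1 + 2t + 3t^2
coords = shifted_coordinates(p, 1)
print(coords, expand(coords, 1))
```

For $p(t) = 1 + 2t + 3t^2$ and $a = 1$ this gives $p(t) = 6 + 8(t - 1) + 3(t - 1)^2$.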




Example 1.3.21. Let $a$ be an arbitrary number in $\mathbb{K}$ and let
$$\mathcal{U} = \{p \in \mathcal{P}_n(\mathbb{K}) : p(a) = p'(a) = p''(a) = 0\}.$$

(a) Show that $\mathcal{U}$ is a vector subspace of $\mathcal{P}_n(\mathbb{K})$.

(b) Find a basis of $\mathcal{U}$.

(c) Determine the coordinates of an arbitrary polynomial from $\mathcal{U}$ in that basis.

Solution. We are going to use the result from Example 1.3.20. If $p \in \mathcal{P}_n(\mathbb{K})$ and $p(a) = p'(a) = p''(a) = 0$, then
$$p(t) = \frac{1}{3!}p'''(a)(t - a)^3 + \dots + \frac{1}{n!}p^{(n)}(a)(t - a)^n, \tag{1.3}$$
by (1.2). Consequently,
$$\mathcal{U} = \operatorname{Span}\{(t - a)^3, \dots, (t - a)^n\}$$
and, since the functions $(t - a)^3, \dots, (t - a)^n$ are linearly independent, the set $\{(t - a)^3, \dots, (t - a)^n\}$ is a basis of $\mathcal{U}$. Finally, the coordinates of an arbitrary polynomial from $\mathcal{U}$ in that basis are already in (1.3). □


The following theorem can often be used to solve problems involving linear 
independence of vectors using methods from matrix algebra. 


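Theorem 1.3.22. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis of a vector space $\mathcal{V}$ and let
$$\mathbf{w}_k = a_{1k}\mathbf{v}_1 + \dots + a_{nk}\mathbf{v}_n$$
for some $a_{jk} \in \mathbb{K}$, $1 \leq j, k \leq n$. Then the vectors $\mathbf{w}_1, \dots, \mathbf{w}_n$ are linearly independent if and only if the columns of the matrix
$$\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix}$$
are linearly independent.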


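Proof. First note that we can write
$$x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n = \left(\begin{bmatrix} a_{11} & \cdots & a_{1n} \end{bmatrix}\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}\right)\mathbf{v}_1 + \dots + \left(\begin{bmatrix} a_{n1} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}\right)\mathbf{v}_n.$$
Now, because the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent,
$$x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n = \mathbf{0}$$
is equivalent to
$$\begin{bmatrix} a_{11} & \cdots & a_{1n} \end{bmatrix}\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \dots = \begin{bmatrix} a_{n1} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = 0,$$
which can be written as
$$\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{bmatrix}\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$$
or, equivalently,
$$x_1\begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix} + \dots + x_n\begin{bmatrix} a_{1n} \\ \vdots \\ a_{nn} \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}.$$
Therefore the vectors $\mathbf{w}_1, \dots, \mathbf{w}_n$ are linearly independent if and only if the vectors $\begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix}, \dots, \begin{bmatrix} a_{1n} \\ \vdots \\ a_{nn} \end{bmatrix}$ are linearly independent, proving the theorem. □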


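Example 1.3.23. Let $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ be arbitrary vectors in a vector space. Show that the vectors
$$2\mathbf{v}_1 + 3\mathbf{v}_2 + \mathbf{v}_3,\quad 5\mathbf{v}_1 + 2\mathbf{v}_2 + 8\mathbf{v}_3,\quad 2\mathbf{v}_1 + 7\mathbf{v}_2 - 3\mathbf{v}_3$$
are linearly dependent and express one of them as a linear combination of the remaining two.

Solution. Because the reduced row echelon form of the matrix
$$\begin{bmatrix} 2 & 5 & 2 \\ 3 & 2 & 7 \\ 1 & 8 & -3 \end{bmatrix}$$
is
$$\begin{bmatrix} 1 & 0 & \frac{31}{11} \\ 0 & 1 & -\frac{8}{11} \\ 0 & 0 & 0 \end{bmatrix},$$
the vectors are linearly dependent and we have
$$2\mathbf{v}_1 + 7\mathbf{v}_2 - 3\mathbf{v}_3 = \frac{31}{11}(2\mathbf{v}_1 + 3\mathbf{v}_2 + \mathbf{v}_3) - \frac{8}{11}(5\mathbf{v}_1 + 2\mathbf{v}_2 + 8\mathbf{v}_3).$$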
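The reduced row echelon form used in this solution can be reproduced with exact rational arithmetic. The following is a minimal sketch; `rref` is a hypothetical helper written for this illustration, not a library routine.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]          # make the leading entry 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return m


# Columns hold the coordinates of the three vectors with respect to v1, v2, v3.
M = [[2, 5, 2],
     [3, 2, 7],
     [1, 8, -3]]
reduced = rref(M)
print(reduced)
```

The third column of the result contains the coefficients $\frac{31}{11}$ and $-\frac{8}{11}$ that express the third vector in terms of the first two.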


Example 1.3.24. Show that the polynomials 
t(t ae Ne 2 ats 2t, (t alg Bigs (t al ie 


are linearly independent in P3(K). 


Solution. If we use {1, t, t?, t?} as a basis in P3(K), then the matrix in Theorem 
1.3.22 is 


orhN = 


1 
3 
3 
1 


the polynomials ¢(t— 1), t? + 2t, (t+1)°, (t+ 1)? are linearly independent. O 


Example 1.3.25. Find a basis of the vector space 


ee ape Wel eto 7 i, 2 
me es Ase neea area eina hoes [fe 


Solution. We can solve this problem by finding a basis of the vector space 


(a) Paseo i 
: 2 =p) 2 
re UP Gerdes 
1+i 3—5i i 




The reduced row echelon form of the matrix 
1-i -34+57 i 
2 —2 2 
i 6-314 38 
1+%4 -3-—51% -i 


is 


oo fo 
Seo = 
CoO ME NIW 


and thus the set 


([ i a4) [ ata: 3-54} (1.4) 


is linearly independent. Since 
i 2) 3 il 2 ae —3+5% —2 
3 -i| 2 a 1+% 2) 6-31 -3-51]’ 


_ oy este: =) 
y= Span {| ome 6—3i Bent 


so the set (1.4) is a basis of VY. oO 


we have 


Example 1.3.26. Let $a$, $b$, and $c$ be distinct numbers from $\mathbb{K}$. Find a basis in the space
$$\mathcal{U} = \{p \in \mathcal{P}_n(\mathbb{K}) : p(a) = p(b) = p(c)\},$$
where $n \geq 3$.

Solution. Let $k = p(a) = p(b) = p(c)$ and $q(t) = (t - a)(t - b)(t - c)$. If $p \in \mathcal{U}$, then
$$p(t) = s(t)q(t) + k,$$
where $s(t)$ is 0 or a polynomial of degree at most $n - 3$. In other words,
$$\mathcal{U} = \{sq + k : s \in \mathcal{P}_{n-3}(\mathbb{K}),\ k \in \mathbb{K}\}.$$
Consequently,
$$\{1, q(t), tq(t), \dots, t^{n-3}q(t)\}$$
is a basis of $\mathcal{U}$.




1.4 Direct sums 


In Section 1.2 we introduced sums of subspaces. Now we are going to refine 
that idea by introducing direct sums which play a much more important role in 
linear algebra. 


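Definition 1.4.1. Let $\mathcal{V}_1, \dots, \mathcal{V}_n$ be subspaces of a vector space $\mathcal{V}$. The sum
$$\mathcal{V}_1 + \dots + \mathcal{V}_n$$
is called a direct sum if every $\mathbf{v} \in \mathcal{V}_1 + \dots + \mathcal{V}_n$ can be written in a unique way as a sum $\mathbf{v}_1 + \dots + \mathbf{v}_n$ with $\mathbf{v}_j \in \mathcal{V}_j$ for every $j \in \{1, \dots, n\}$. To indicate that the sum $\mathcal{V}_1 + \dots + \mathcal{V}_n$ is direct we write
$$\mathcal{V}_1 \oplus \dots \oplus \mathcal{V}_n.$$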


The condition of uniqueness of the representation $\mathbf{v}_1 + \dots + \mathbf{v}_n$ with $\mathbf{v}_j \in \mathcal{V}_j$ for every $j \in \{1, \dots, n\}$ is similar to a condition that characterizes linear independence of vectors. This is not a coincidence. As we will soon see, direct sums and linear independence are closely related. The following theorem gives a simple condition that characterizes direct sums. It should not be a surprise that it is similar to the condition that characterizes linear independence of vectors.


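Theorem 1.4.2. Let $\mathcal{V}_1, \dots, \mathcal{V}_n$ be subspaces of a vector space $\mathcal{V}$. The sum $\mathcal{V}_1 + \dots + \mathcal{V}_n$ is direct if and only if the equality
$$\mathbf{v}_1 + \dots + \mathbf{v}_n = \mathbf{0}$$
with $\mathbf{v}_j \in \mathcal{V}_j$ for every $j \in \{1, \dots, n\}$, implies $\mathbf{v}_1 = \dots = \mathbf{v}_n = \mathbf{0}$.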


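Proof. Assume that the sum $\mathcal{V}_1 + \dots + \mathcal{V}_n$ is direct and that
$$\mathbf{v}_1 + \dots + \mathbf{v}_n = \mathbf{0}$$
with $\mathbf{v}_j \in \mathcal{V}_j$ for every $j \in \{1, \dots, n\}$. Since we also have
$$\mathbf{0} + \dots + \mathbf{0} = \mathbf{0}$$
and $\mathbf{0} \in \mathcal{V}_j$ for every $j \in \{1, \dots, n\}$, we must have $\mathbf{v}_1 = \dots = \mathbf{v}_n = \mathbf{0}$ by the uniqueness requirement in the definition of direct sums.
Now assume that for any vectors $\mathbf{v}_j \in \mathcal{V}_j$ for $j \in \{1, \dots, n\}$, the equality $\mathbf{v}_1 + \dots + \mathbf{v}_n = \mathbf{0}$ implies $\mathbf{v}_1 = \dots = \mathbf{v}_n = \mathbf{0}$. We need to show the uniqueness property. If
$$\mathbf{u}_1 + \dots + \mathbf{u}_n = \mathbf{w}_1 + \dots + \mathbf{w}_n$$
with $\mathbf{u}_j, \mathbf{w}_j \in \mathcal{V}_j$ for every $j \in \{1, \dots, n\}$, then
$$(\mathbf{u}_1 - \mathbf{w}_1) + \dots + (\mathbf{u}_n - \mathbf{w}_n) = \mathbf{0}.$$
Since $\mathbf{u}_j - \mathbf{w}_j \in \mathcal{V}_j$ for all $j \in \{1, \dots, n\}$, we can conclude that $\mathbf{u}_j = \mathbf{w}_j$ for all $j \in \{1, \dots, n\}$. □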


Note that, if a sum $\mathcal{V}_1 + \dots + \mathcal{V}_n$ is a direct sum and $k_1, k_2, \dots, k_m \in \{1, 2, \dots, n\}$ are distinct indices, then the sum $\mathcal{V}_{k_1} + \dots + \mathcal{V}_{k_m}$ is also direct.


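Example 1.4.3. Suppose that $\mathcal{U}$, $\mathcal{U}'$, $\mathcal{V}$, and $\mathcal{V}'$ are subspaces of a vector space $\mathcal{W}$. If
$$\mathcal{U} = \mathcal{U}' \oplus (\mathcal{U} \cap \mathcal{V}) \quad\text{and}\quad \mathcal{V} = \mathcal{V}' \oplus (\mathcal{U} \cap \mathcal{V}),$$
show that
$$\mathcal{U} + \mathcal{V} = \mathcal{U}' \oplus (\mathcal{U} \cap \mathcal{V}) \oplus \mathcal{V}'.$$

Solution. Because it is easy to see that
$$\mathcal{U} + \mathcal{V} = \mathcal{U}' + (\mathcal{U} \cap \mathcal{V}) + \mathcal{V}',$$
we only have to show that the sum is direct. Suppose that
$$\mathbf{u} + \mathbf{w} + \mathbf{v} = \mathbf{0},$$
where $\mathbf{u} \in \mathcal{U}'$, $\mathbf{w} \in \mathcal{U} \cap \mathcal{V}$, and $\mathbf{v} \in \mathcal{V}'$. From
$$\mathbf{v} = -\mathbf{u} - \mathbf{w} \in \mathcal{U}' \oplus (\mathcal{U} \cap \mathcal{V}) = \mathcal{U}$$
we get
$$\mathbf{v} \in \mathcal{V}' \cap \mathcal{U} \subseteq \mathcal{V} \cap \mathcal{U},$$
which means that
$$\mathbf{v} \in \mathcal{V}' \cap (\mathcal{U} \cap \mathcal{V}) = \{\mathbf{0}\}.$$
Since $\mathbf{v} = \mathbf{0}$, we have $\mathbf{u} + \mathbf{w} = \mathbf{0}$ and thus $\mathbf{u} = \mathbf{w} = \mathbf{0}$. □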


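Theorem 1.4.4. Suppose $\mathcal{V}$ and $\mathcal{W}$ are subspaces of a vector space $\mathcal{X}$. The sum $\mathcal{V} + \mathcal{W}$ is direct if and only if $\mathcal{V} \cap \mathcal{W} = \{\mathbf{0}\}$.

Proof. If the sum $\mathcal{V} + \mathcal{W}$ is direct and $\mathbf{v} = \mathbf{w}$, with $\mathbf{v} \in \mathcal{V}$ and $\mathbf{w} \in \mathcal{W}$, then the equality $\mathbf{0} = \mathbf{v} - \mathbf{w}$ implies $\mathbf{v} = \mathbf{w} = \mathbf{0}$. Hence every vector in $\mathcal{V} \cap \mathcal{W}$ is $\mathbf{0}$.

Now, if $\mathcal{V} \cap \mathcal{W} = \{\mathbf{0}\}$ and $\mathbf{v} + \mathbf{w} = \mathbf{0}$ with $\mathbf{v} \in \mathcal{V}$ and $\mathbf{w} \in \mathcal{W}$, then $\mathbf{v} = -\mathbf{w}$. Hence $\mathbf{v} = \mathbf{w} = \mathbf{0}$ because $-\mathbf{w} \in \mathcal{W}$, so that $\mathbf{v} \in \mathcal{V} \cap \mathcal{W}$. □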




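Example 1.4.5. Show that the set
$$\mathcal{U} = \{p \in \mathcal{P}_n(\mathbb{K}) : t^2 + 1 \text{ divides } p\},$$
where $n \geq 3$, is a subspace of $\mathcal{P}_n(\mathbb{K})$ and that we have
$$\mathcal{U} \oplus \mathcal{P}_1(\mathbb{K}) = \mathcal{P}_n(\mathbb{K}).$$

Solution. It is easy to show, as a consequence of the definitions, that $\mathcal{U}$ is a subspace of the vector space $\mathcal{P}_n(\mathbb{K})$.
If $q \in \mathcal{P}_n(\mathbb{K})$, then there are polynomials $r$ and $s$ such that
$$q = s(t^2 + 1) + r,$$
where $r = 0$ or $\deg r < 2$. Since $s(t^2 + 1) \in \mathcal{U}$, we have shown that
$$\mathcal{U} + \mathcal{P}_1(\mathbb{K}) = \mathcal{P}_n(\mathbb{K}).$$
The sum $\mathcal{U} + \mathcal{P}_1(\mathbb{K})$ is direct by Theorem 1.4.4, because a nonzero polynomial divisible by $t^2 + 1$ has degree at least 2, so $\mathcal{U} \cap \mathcal{P}_1(\mathbb{K}) = \{0\}$. □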


The property in Theorem 1.4.4 does not extend to three or more subspaces. It is possible that $\mathcal{U} \cap \mathcal{V} \cap \mathcal{W} = \{\mathbf{0}\}$, but the sum $\mathcal{U} + \mathcal{V} + \mathcal{W}$ is not direct. We leave showing this as an exercise.

The next theorem makes the following statement precise: The direct sum of 
direct sums is direct. 


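Theorem 1.4.6. Let $n_1, n_2, \dots, n_m$ be arbitrary positive integers and let $\mathcal{V}_{j,k}$ be subspaces of a vector space $\mathcal{V}$ for $1 \leq k \leq m$ and $1 \leq j \leq n_k$. If the sum
$$\mathcal{U}_k = \mathcal{V}_{1,k} + \mathcal{V}_{2,k} + \dots + \mathcal{V}_{n_k,k}$$
is direct for every $1 \leq k \leq m$ and the sum
$$\mathcal{U}_1 + \dots + \mathcal{U}_m$$
is direct, then the sum
$$\mathcal{V}_{1,1} + \dots + \mathcal{V}_{n_1,1} + \dots + \mathcal{V}_{1,m} + \dots + \mathcal{V}_{n_m,m}$$
is direct and we have
$$\mathcal{V}_{1,1} \oplus \dots \oplus \mathcal{V}_{n_1,1} \oplus \dots \oplus \mathcal{V}_{1,m} \oplus \dots \oplus \mathcal{V}_{n_m,m} = \mathcal{U}_1 \oplus \dots \oplus \mathcal{U}_m.$$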


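Proof. Suppose
$$\mathbf{v}_{1,1} + \dots + \mathbf{v}_{n_1,1} + \dots + \mathbf{v}_{1,m} + \dots + \mathbf{v}_{n_m,m} = \mathbf{0},$$
with $\mathbf{v}_{j,k} \in \mathcal{V}_{j,k}$ for all $j$ and $k$. Since
$$\mathbf{v}_{1,1} + \dots + \mathbf{v}_{n_1,1} + \dots + \mathbf{v}_{1,m} + \dots + \mathbf{v}_{n_m,m} = (\mathbf{v}_{1,1} + \dots + \mathbf{v}_{n_1,1}) + \dots + (\mathbf{v}_{1,m} + \dots + \mathbf{v}_{n_m,m})$$
and the sum $\mathcal{U}_1 + \dots + \mathcal{U}_m$ is direct, we have $\mathbf{v}_{1,k} + \dots + \mathbf{v}_{n_k,k} = \mathbf{0}$ for every $1 \leq k \leq m$ and thus, because each sum $\mathcal{U}_k$ is direct,
$$\mathbf{v}_{1,1} = \dots = \mathbf{v}_{n_1,1} = \dots = \mathbf{v}_{1,m} = \dots = \mathbf{v}_{n_m,m} = \mathbf{0}.$$
This shows that the sum $\mathcal{V}_{1,1} + \dots + \mathcal{V}_{n_1,1} + \dots + \mathcal{V}_{1,m} + \dots + \mathcal{V}_{n_m,m}$ is direct. The last equality in the theorem is an immediate consequence of Theorem 1.2.13. □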


Now we formulate and prove a theorem that connects direct sums with linear 
independence. 


Theorem 1.4.7. If $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent vectors in a vector space $\mathcal{V}$, then the sum
$$\mathbb{K}\mathbf{v}_1 + \dots + \mathbb{K}\mathbf{v}_n$$
is direct.


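Proof. Assume the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent. If
$$\mathbf{u}_1 + \dots + \mathbf{u}_n = \mathbf{0}$$
with $\mathbf{u}_j \in \mathbb{K}\mathbf{v}_j$ for $j \in \{1, \dots, n\}$, then $\mathbf{u}_j = x_j\mathbf{v}_j$ for some $x_1, \dots, x_n \in \mathbb{K}$. Then
$$x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = \mathbf{0}$$
and, since the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent, $x_1 = \dots = x_n = 0$. But this means that $\mathbf{u}_1 = \dots = \mathbf{u}_n = \mathbf{0}$, proving that the sum $\mathbb{K}\mathbf{v}_1 + \dots + \mathbb{K}\mathbf{v}_n$ is direct. □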


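The two conditions in the above theorem are not equivalent: for a nonzero vector $\mathbf{v}$ the sum $\mathbb{K}\mathbf{0} + \mathbb{K}\mathbf{v}$ is direct, but the vectors $\mathbf{0}$ and $\mathbf{v}$ are linearly dependent. We need an additional assumption to get the implication in the other direction.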


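Theorem 1.4.8. If $\mathbf{v}_1, \dots, \mathbf{v}_n$ are nonzero vectors in a vector space $\mathcal{V}$ and the sum
$$\mathbb{K}\mathbf{v}_1 + \dots + \mathbb{K}\mathbf{v}_n$$
is direct, then the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent.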




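Proof. Assume the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are nonzero and the sum $\mathbb{K}\mathbf{v}_1 + \dots + \mathbb{K}\mathbf{v}_n$ is direct. If
$$x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = \mathbf{0},$$
then we get $x_1\mathbf{v}_1 = \dots = x_n\mathbf{v}_n = \mathbf{0}$. Since the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are nonzero, we must have $x_1 = \dots = x_n = 0$, proving that the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent. □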


Example 1.4.9. Show that the sum 
Spani¢ aig oer spans 8 t-- 2 ar YP 


is a direct sum of subspaces of P3(K). 


Proof. It suffices to check that the functions t? + 1,t? + 3,2 + ¢+ 2,0? + 3¢? 
are linearly independent and then use Theorems 1.4.6 and 1.4.7. O 


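The next theorem describes another desirable property of direct sums. In general, if $\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is a basis of $\mathcal{U}$ and $\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ is a basis of $\mathcal{V}$, then $\{\mathbf{u}_1, \dots, \mathbf{u}_k, \mathbf{v}_1, \dots, \mathbf{v}_m\}$ need not be a basis of $\mathcal{U} + \mathcal{V}$. It turns out that we don’t have this problem if the sum is direct.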


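Theorem 1.4.10. Let $\mathcal{V}_1, \dots, \mathcal{V}_n$ be subspaces of a vector space $\mathcal{V}$ such that the sum
$$\mathcal{V}_1 + \dots + \mathcal{V}_n$$
is direct. If $\{\mathbf{v}_{1,j}, \dots, \mathbf{v}_{k_j,j}\}$ is a basis of $\mathcal{V}_j$ for each $1 \leq j \leq n$, then
$$\{\mathbf{v}_{1,1}, \dots, \mathbf{v}_{k_1,1}, \dots, \mathbf{v}_{1,n}, \dots, \mathbf{v}_{k_n,n}\}$$
is a basis of $\mathcal{V}_1 \oplus \dots \oplus \mathcal{V}_n$.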


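Proof. According to Theorem 1.4.6 we have
$$\begin{aligned}
\mathcal{V}_1 \oplus \dots \oplus \mathcal{V}_n &= (\mathbb{K}\mathbf{v}_{1,1} \oplus \dots \oplus \mathbb{K}\mathbf{v}_{k_1,1}) \oplus \dots \oplus (\mathbb{K}\mathbf{v}_{1,n} \oplus \dots \oplus \mathbb{K}\mathbf{v}_{k_n,n}) \\
&= \mathbb{K}\mathbf{v}_{1,1} \oplus \dots \oplus \mathbb{K}\mathbf{v}_{k_1,1} \oplus \dots \oplus \mathbb{K}\mathbf{v}_{1,n} \oplus \dots \oplus \mathbb{K}\mathbf{v}_{k_n,n}
\end{aligned}$$
and thus
$$\mathcal{V}_1 \oplus \dots \oplus \mathcal{V}_n = \operatorname{Span}\{\mathbf{v}_{1,1}, \dots, \mathbf{v}_{k_1,1}, \dots, \mathbf{v}_{1,n}, \dots, \mathbf{v}_{k_n,n}\}.$$
Now the result follows by Theorem 1.4.8. □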




Example 1.4.11. Show that the sum
$$\operatorname{Span}\{1, \cos t, \cos 2t\} + \operatorname{Span}\{\sin t, \sin 2t\} \tag{1.5}$$
is a direct sum of subspaces of $\mathcal{F}_{\mathbb{K}}(\mathbb{R})$ and that
$$\{1, \cos t, \cos 2t, \sin t, \sin 2t\}$$
is a basis of $\operatorname{Span}\{1, \cos t, \cos 2t\} \oplus \operatorname{Span}\{\sin t, \sin 2t\}$.

Solution. If $a_0, a_1, a_2, b_1, b_2 \in \mathbb{K}$ are such that
$$a_0 + a_1\cos t + a_2\cos 2t = b_1\sin t + b_2\sin 2t,$$
then also
$$a_0 + a_1\cos(-t) + a_2\cos(-2t) = b_1\sin(-t) + b_2\sin(-2t),$$
which simplifies to
$$a_0 + a_1\cos t + a_2\cos 2t = -(b_1\sin t + b_2\sin 2t).$$
Consequently,
$$a_0 + a_1\cos t + a_2\cos 2t = 0$$
and
$$b_1\sin t + b_2\sin 2t = 0.$$
We have shown in Example 1.3.8 that the functions $1, \cos t, \cos 2t$ are linearly independent. It can be shown, using the same approach, that the functions $\sin t, \sin 2t$ are linearly independent. Linear independence of the functions $1, \cos t, \cos 2t$ gives us $a_0 = a_1 = a_2 = 0$ and linear independence of the functions $\sin t, \sin 2t$ gives us $b_1 = b_2 = 0$. Hence, by Theorem 1.4.4, the sum (1.5) is direct and the set $\{1, \cos t, \cos 2t, \sin t, \sin 2t\}$ is its basis, by Theorem 1.4.10. □


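Example 1.4.12. Show that the sum
$$\mathcal{S}_{n \times n}(\mathbb{K}) + \mathcal{A}_{n \times n}(\mathbb{K})$$
is direct and we have
$$\mathcal{S}_{n \times n}(\mathbb{K}) \oplus \mathcal{A}_{n \times n}(\mathbb{K}) = \mathcal{M}_{n \times n}(\mathbb{K}),$$
where
$$\mathcal{S}_{n \times n}(\mathbb{K}) = \{A \in \mathcal{M}_{n \times n}(\mathbb{K}) : A^T = A\}$$
and
$$\mathcal{A}_{n \times n}(\mathbb{K}) = \{A \in \mathcal{M}_{n \times n}(\mathbb{K}) : A^T = -A\}.$$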

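Solution. If $A \in \mathcal{S}_{n \times n}(\mathbb{K}) \cap \mathcal{A}_{n \times n}(\mathbb{K})$, then $A = A^T = -A$ and thus $A = -A$. Consequently, the entries of the matrix $A$ are all 0. This shows that the sum $\mathcal{S}_{n \times n}(\mathbb{K}) + \mathcal{A}_{n \times n}(\mathbb{K})$ is direct, by Theorem 1.4.4.

Now we observe that any matrix $A \in \mathcal{M}_{n \times n}(\mathbb{K})$ can be written in the form
$$A = \frac{1}{2}(A + A^T) + \frac{1}{2}(A - A^T).$$
Since $A + A^T \in \mathcal{S}_{n \times n}(\mathbb{K})$ and $A - A^T \in \mathcal{A}_{n \times n}(\mathbb{K})$, we have $A \in \mathcal{S}_{n \times n}(\mathbb{K}) \oplus \mathcal{A}_{n \times n}(\mathbb{K})$. □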
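The decomposition in this solution is easy to compute for a concrete matrix. The sketch below assumes a square matrix with integer entries stored as a list of rows; `transpose` and `decompose` are hypothetical helper names.

```python
from fractions import Fraction


def transpose(a):
    return [list(col) for col in zip(*a)]


def decompose(a):
    """Return (S, K) with S symmetric, K antisymmetric, and S + K == a."""
    at = transpose(a)
    n = len(a)
    sym = [[Fraction(a[i][j] + at[i][j], 2) for j in range(n)] for i in range(n)]
    anti = [[Fraction(a[i][j] - at[i][j], 2) for j in range(n)] for i in range(n)]
    return sym, anti


A = [[1, 2], [4, 3]]
S_part, K_part = decompose(A)
print(S_part, K_part)
```

For this matrix the symmetric part has rows $(1, 3)$ and $(3, 3)$, and the antisymmetric part has rows $(0, -1)$ and $(1, 0)$.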


So far in this section we were concerned with checking whether a given sum 
of subspaces was direct. Now we are going to investigate a different problem. We 
would like to be able to decompose a given vector space into subspaces that give 
us the original vector space as the direct sum of those subspaces. In practice, 
we often want those subspaces to have some special properties. We begin by 
proving a simple lemma that gives us the basic ingredient of the construction. 


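Lemma 1.4.13. Let $\mathcal{U}$ be a subspace of a vector space $\mathcal{V}$ and let $\mathbf{v} \in \mathcal{V}$. If $\mathbf{v} \notin \mathcal{U}$, then the sum $\mathcal{U} + \mathbb{K}\mathbf{v}$ is direct.

Proof. Suppose
$$\mathbf{u} + \alpha\mathbf{v} = \mathbf{0}$$
for some $\mathbf{u} \in \mathcal{U}$ and $\alpha \in \mathbb{K}$. If $\alpha \neq 0$, then $\mathbf{v} = -\frac{1}{\alpha}\mathbf{u} \in \mathcal{U}$, which is not possible, because of our assumption. Consequently, $\alpha = 0$ and thus also $\mathbf{u} = \mathbf{0}$. □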


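Theorem 1.4.14. Let $\mathcal{U}$ be a subspace of a vector space $\mathcal{V}$. If $\mathcal{V} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$, then there are linearly independent vectors $\mathbf{v}_{i_1}, \dots, \mathbf{v}_{i_m} \in \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ such that
$$\mathcal{V} = \mathcal{U} \oplus \mathbb{K}\mathbf{v}_{i_1} \oplus \dots \oplus \mathbb{K}\mathbf{v}_{i_m}.$$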


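Proof. If $\mathcal{U} = \mathcal{V}$, then we have nothing to prove.

Now, if $\mathcal{U} \neq \mathcal{V}$, then there is a vector $\mathbf{v}_{i_1} \in \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ such that $\mathbf{v}_{i_1} \notin \mathcal{U}$. According to Lemma 1.4.13 the sum $\mathcal{U} + \mathbb{K}\mathbf{v}_{i_1}$ is direct. If $\mathcal{U} \oplus \mathbb{K}\mathbf{v}_{i_1} = \mathcal{V}$, then we are done. If $\mathcal{U} \oplus \mathbb{K}\mathbf{v}_{i_1} \neq \mathcal{V}$, then we continue using mathematical induction. Suppose that we have proven that the sum
$$\mathcal{U} + \mathbb{K}\mathbf{v}_{i_1} + \dots + \mathbb{K}\mathbf{v}_{i_k}$$
is direct. If $\mathcal{U} \oplus \mathbb{K}\mathbf{v}_{i_1} \oplus \dots \oplus \mathbb{K}\mathbf{v}_{i_k} = \mathcal{V}$, then we are done. If $\mathcal{U} \oplus \mathbb{K}\mathbf{v}_{i_1} \oplus \dots \oplus \mathbb{K}\mathbf{v}_{i_k} \neq \mathcal{V}$, then there is a vector $\mathbf{v}_{i_{k+1}} \in \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ which is not in $\mathcal{U} + \mathbb{K}\mathbf{v}_{i_1} + \dots + \mathbb{K}\mathbf{v}_{i_k}$. But then the sum $\mathcal{U} + \mathbb{K}\mathbf{v}_{i_1} + \dots + \mathbb{K}\mathbf{v}_{i_k} + \mathbb{K}\mathbf{v}_{i_{k+1}}$ is direct by Lemma 1.4.13 and Theorem 1.4.6.

Since the set $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ has a finite number of elements, we will reach a point when $\mathcal{U} \oplus \mathbb{K}\mathbf{v}_{i_1} \oplus \dots \oplus \mathbb{K}\mathbf{v}_{i_m} = \mathcal{V}$. The chosen vectors are nonzero, because none of them lies in $\mathcal{U}$, and the sum $\mathbb{K}\mathbf{v}_{i_1} + \dots + \mathbb{K}\mathbf{v}_{i_m}$ is direct, so they are linearly independent by Theorem 1.4.8. □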


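Corollary 1.4.15. Let $\mathcal{U}$ be a subspace of a vector space $\mathcal{V}$. If
$\mathcal{V} = \mathrm{Span}\{v_1, \dots, v_n\}$, then there is a subspace $\mathcal{W}$ of the vector space $\mathcal{V}$ such
that
$$\mathcal{V} = \mathcal{U} \oplus \mathcal{W}.$$


Proof. $\mathcal{W} = \mathbb{K}v_{i_1} \oplus \dots \oplus \mathbb{K}v_{i_m}$, where $v_{i_1}, \dots, v_{i_m} \in \{v_1, \dots, v_n\}$ are the vectors
in Theorem 1.4.14. □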


A subspace $\mathcal{W}$ of a vector space $\mathcal{V}$ such that $\mathcal{V} = \mathcal{U} \oplus \mathcal{W}$ will be called a
complement of $\mathcal{U}$ in $\mathcal{V}$. It is important to remember that a complement is not
unique. For example, both $\mathcal{W} = \{(t,t) : t \in \mathbb{R}\}$ and $\mathcal{W} = \{(0,t) : t \in \mathbb{R}\}$ are
complements of $\mathcal{U} = \{(t,0) : t \in \mathbb{R}\}$ in the vector space $\mathcal{V} = \mathbb{R}^2 = \{(s,t) :
s,t \in \mathbb{R}\}$. In Chapter 3 we introduce a different notion of complements, namely
orthogonal complements, and prove that such complements are unique.
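
To see concretely that a complement is not unique, one can split the same vector of the plane along the horizontal axis and along two different lines. The sketch below is only an illustration; the function name and the sample vector are assumptions made for this example.

```python
from fractions import Fraction

def decompose(v, w):
    """Write v in R^2 as u + c*w with u on the line U = {(t, 0)}.

    The direction w must not lie in U (w[1] != 0); returns the pair (u, c*w).
    """
    c = Fraction(v[1], w[1])            # the second coordinate fixes c
    part_w = (c * w[0], c * w[1])       # component in the complement K*w
    part_u = (v[0] - part_w[0], 0)      # what is left lies in U
    return part_u, part_w

v = (3, 2)
print(decompose(v, (1, 1)))  # complement W = {(t, t)}
print(decompose(v, (0, 1)))  # complement W = {(0, t)}
```

For each choice of complement the decomposition is unique, but the component in $\mathcal{U}$ changes with the complement: here it is $(1, 0)$ in the first case and $(3, 0)$ in the second.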


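Corollary 1.4.16. Let $\mathcal{V}$ be a vector space. If $\mathcal{V} = \mathrm{Span}\{v_1, \dots, v_n\}$
and the vectors $w_1, \dots, w_k \in \mathcal{V}$ are linearly independent, then there are
linearly independent vectors $v_{i_1}, \dots, v_{i_m} \in \{v_1, \dots, v_n\}$ such that the set
$\{w_1, \dots, w_k, v_{i_1}, \dots, v_{i_m}\}$ is a basis of $\mathcal{V}$.


Proof. We apply Theorem 1.4.14 to the subspace
$$\mathcal{W} = \mathbb{K}w_1 \oplus \dots \oplus \mathbb{K}w_k$$
and we get linearly independent vectors $v_{i_1}, \dots, v_{i_m}$ such that
$$\mathcal{V} = \mathcal{W} \oplus \mathbb{K}v_{i_1} \oplus \dots \oplus \mathbb{K}v_{i_m} = \mathbb{K}w_1 \oplus \dots \oplus \mathbb{K}w_k \oplus \mathbb{K}v_{i_1} \oplus \dots \oplus \mathbb{K}v_{i_m}.$$


This means that the set $\{w_1, \dots, w_k, v_{i_1}, \dots, v_{i_m}\}$ is a basis of $\mathcal{V}$. □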


Corollary 1.4.17. Let $\mathcal{V}$ be a vector space. If $\mathcal{V} = \mathrm{Span}\{v_1, \dots, v_n\}$,
then there are linearly independent vectors $v_{i_1}, \dots, v_{i_m} \in \{v_1, \dots, v_n\}$
such that the set $\{v_{i_1}, \dots, v_{i_m}\}$ is a basis of $\mathcal{V}$.


Proof. We apply Theorem 1.4.14 to the subspace $\mathcal{W} = \{0\}$.
Note that in Example 1.3.16 we give an alternative proof of the above fact.
□
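
The construction in the proof of Theorem 1.4.14 is essentially a greedy procedure: run through the spanning vectors and keep each one that is not yet in the span of what has been collected. The sketch below illustrates this for vectors with rational coordinates. The helper names `rank` and `extend_to_basis` and the sample data are assumptions made for this example; membership in a span is tested by comparing ranks, which is one possible choice.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of coordinate vectors over Q, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def extend_to_basis(independent, spanning):
    """Return the spanning vectors added greedily to the independent list.

    Assumes `independent` is linearly independent; a vector is added
    exactly when it is not in the span of the vectors chosen so far.
    """
    chosen = list(independent)
    added = []
    for v in spanning:
        if rank(chosen + [v]) > len(chosen):
            chosen.append(v)
            added.append(v)
    return added

spanning = [(1, 0, 0), (1, 1, 0), (2, 1, 0), (0, 0, 1)]
print(extend_to_basis([], spanning))           # a basis taken from the spanning set
print(extend_to_basis([(1, 1, 1)], spanning))  # vectors completing (1, 1, 1) to a basis
```

Starting from the empty list corresponds to Corollary 1.4.17, and starting from given independent vectors corresponds to Corollary 1.4.16.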


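Theorem 1.4.18. Let $\mathcal{U}$, $\mathcal{V}$, and $\mathcal{W}$ be subspaces of a vector space $\mathcal{X}$
such that
$$\mathcal{V} \oplus \mathcal{U} = \mathcal{W} \oplus \mathcal{U}.$$


If $\mathcal{V}$ has a basis with $m$ vectors, then $\mathcal{W}$ has a basis with $m$ vectors.


Proof. If $\{v_1, \dots, v_m\}$ is a basis of $\mathcal{V}$, then $\mathcal{V} = \mathbb{K}v_1 \oplus \dots \oplus \mathbb{K}v_m$. Since
$$\mathbb{K}v_1 \oplus \dots \oplus \mathbb{K}v_m \oplus \mathcal{U} = \mathcal{W} \oplus \mathcal{U},$$
for every $1 \le j \le m$, we have
$$v_j = w_j + u_j,$$
for some $w_j \in \mathcal{W}$ and $u_j \in \mathcal{U}$. We will show that $\{w_1, \dots, w_m\}$ is a basis of $\mathcal{W}$.
Indeed, if $x_1 w_1 + \dots + x_m w_m = 0$, then
$$x_1 v_1 + \dots + x_m v_m = x_1 u_1 + \dots + x_m u_m$$
and thus $x_1 v_1 + \dots + x_m v_m = 0$, because $u_j \in \mathcal{U}$ and $\mathrm{Span}\{v_1, \dots, v_m\} \cap \mathcal{U} = \{0\}$.
Consequently, $x_1 = \dots = x_m = 0$, proving linear independence of vectors
$w_1, \dots, w_m$.
Now, if $w$ is an arbitrary vector in $\mathcal{W}$, then there are numbers $x_1, \dots, x_m \in \mathbb{K}$
and a vector $u \in \mathcal{U}$ such that
$$w = x_1 v_1 + \dots + x_m v_m + u = x_1 w_1 + \dots + x_m w_m + x_1 u_1 + \dots + x_m u_m + u.$$
Since $\mathcal{W} \cap \mathcal{U} = \{0\}$, we must have
$$w = x_1 w_1 + \dots + x_m w_m,$$
proving that $\mathcal{W} = \mathrm{Span}\{w_1, \dots, w_m\}$. □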


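Example 1.4.19. Let $\mathcal{V}_1$ and $\mathcal{V}_2$ be arbitrary vector spaces over $\mathbb{K}$ and let
$\mathcal{W} = \mathcal{V}_1 \times \mathcal{V}_2$. Then
$$\mathcal{W}_1 = \mathcal{V}_1 \times \{0\} \quad\text{and}\quad \mathcal{W}_2 = \{0\} \times \mathcal{V}_2$$
are subspaces of $\mathcal{W}$. It is easy to see that
$$\mathcal{W} = \mathcal{W}_1 \oplus \mathcal{W}_2.$$


More generally, if $\mathcal{V}_1, \dots, \mathcal{V}_n$ are arbitrary vector spaces over $\mathbb{K}$ and
$\mathcal{W} = \mathcal{V}_1 \times \dots \times \mathcal{V}_n$, then the spaces
$$\mathcal{W}_1 = \mathcal{V}_1 \times \{0\} \times \dots \times \{0\}$$
$$\mathcal{W}_2 = \{0\} \times \mathcal{V}_2 \times \{0\} \times \dots \times \{0\}$$
$$\vdots$$
$$\mathcal{W}_n = \{0\} \times \dots \times \{0\} \times \mathcal{V}_n$$
are subspaces of $\mathcal{W}$ and we have
$$\mathcal{W} = \mathcal{W}_1 \oplus \dots \oplus \mathcal{W}_n.$$


Note that we cannot write $\mathcal{W} = \mathcal{V}_1 \oplus \dots \oplus \mathcal{V}_n$ because $\mathcal{V}_1, \dots, \mathcal{V}_n$ are not
subspaces of $\mathcal{W}$.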


1.5 Dimension of a vector space 


Dimensions of $\mathbb{R}^2$ and $\mathbb{R}^3$ have a clear and intuitive meaning for us. The notion
of the dimension of an abstract vector space is less intuitive. Before we can
define the dimension of a vector space, we first need to establish some additional
properties of bases in vector spaces. To motivate an approach to the proof of
the first important theorem in this section (Theorem 1.5.2) we consider a special
case.


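Example 1.5.1. If a vector space $\mathcal{V} = \mathrm{Span}\{v_1, v_2, v_3\}$ contains three linearly
independent vectors $w_1, w_2, w_3$, then the vectors $v_1, v_2, v_3$ are linearly
independent and
$$\mathrm{Span}\{v_1, v_2, v_3\} = \mathrm{Span}\{w_1, w_2, w_3\}.$$


Solution. Let $w_1, w_2, w_3$ be linearly independent vectors in $\mathrm{Span}\{v_1, v_2, v_3\}$.
Then $\mathrm{Span}\{w_1, w_2, w_3\} \subseteq \mathrm{Span}\{v_1, v_2, v_3\}$ and
$$w_1 = a_{11}v_1 + a_{21}v_2 + a_{31}v_3$$
$$w_2 = a_{12}v_1 + a_{22}v_2 + a_{32}v_3$$
$$w_3 = a_{13}v_1 + a_{23}v_2 + a_{33}v_3,$$
for some $a_{jk} \in \mathbb{K}$. Let
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}.$$


It is easy to verify that the equality
$$A \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
implies
$$x_1 w_1 + x_2 w_2 + x_3 w_3 = 0.$$


Since the vectors $w_1, w_2, w_3$ are linearly independent, we must have $x_1 = x_2 =
x_3 = 0$. But this means that the matrix $A$ is invertible. Let
$$B = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} = A^{-1}.$$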


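Now we note that
$$b_{11}w_1 + b_{21}w_2 + b_{31}w_3 = (b_{11}a_{11} + b_{21}a_{12} + b_{31}a_{13})v_1$$
$$\qquad + (b_{11}a_{21} + b_{21}a_{22} + b_{31}a_{23})v_2$$
$$\qquad + (b_{11}a_{31} + b_{21}a_{32} + b_{31}a_{33})v_3$$
$$\qquad = 1v_1 + 0v_2 + 0v_3 = v_1,$$
that is
$$v_1 = b_{11}w_1 + b_{21}w_2 + b_{31}w_3.$$


Similarly, we get
$$v_2 = b_{12}w_1 + b_{22}w_2 + b_{32}w_3,$$
$$v_3 = b_{13}w_1 + b_{23}w_2 + b_{33}w_3.$$


Consequently $\mathrm{Span}\{v_1, v_2, v_3\} = \mathrm{Span}\{w_1, w_2, w_3\}$.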
To show that the vectors $v_1, v_2, v_3$ are linearly independent suppose
$$y_1 v_1 + y_2 v_2 + y_3 v_3 = 0$$
for some $y_1, y_2, y_3 \in \mathbb{K}$. Then
$$y_1(b_{11}w_1 + b_{21}w_2 + b_{31}w_3) + y_2(b_{12}w_1 + b_{22}w_2 + b_{32}w_3) + y_3(b_{13}w_1 + b_{23}w_2 + b_{33}w_3) = 0$$
or, equivalently,
$$(y_1 b_{11} + y_2 b_{12} + y_3 b_{13})w_1 + (y_1 b_{21} + y_2 b_{22} + y_3 b_{23})w_2 + (y_1 b_{31} + y_2 b_{32} + y_3 b_{33})w_3 = 0.$$
Since the vectors $w_1, w_2, w_3$ are linearly independent, we must have
$$y_1 b_{11} + y_2 b_{12} + y_3 b_{13} = y_1 b_{21} + y_2 b_{22} + y_3 b_{23} = y_1 b_{31} + y_2 b_{32} + y_3 b_{33} = 0,$$
which can be written as
$$B \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$


But this means that $y_1 = y_2 = y_3 = 0$, because the matrix $B$ is invertible.
Thus the vectors $v_1, v_2, v_3$ are linearly independent. □


And now the general theorem. 


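Theorem 1.5.2. Let $\mathcal{V}$ be a vector space and let $v_1, \dots, v_n \in \mathcal{V}$. If
vectors $w_1, \dots, w_n \in \mathrm{Span}\{v_1, \dots, v_n\}$ are linearly independent, then
the vectors $v_1, \dots, v_n$ are linearly independent and
$$\mathrm{Span}\{v_1, \dots, v_n\} = \mathrm{Span}\{w_1, \dots, w_n\}.$$


In other words, every collection of $n$ linearly independent vectors $w_1, \dots, w_n$
in $\mathrm{Span}\{v_1, \dots, v_n\}$ is a basis of $\mathrm{Span}\{v_1, \dots, v_n\}$ and the spanning set
$\{v_1, \dots, v_n\}$ is also a basis of $\mathrm{Span}\{v_1, \dots, v_n\}$.


Proof. The proof follows the method presented in detail in Example 1.5.1.
Let $w_1, \dots, w_n$ be linearly independent vectors in $\mathrm{Span}\{v_1, \dots, v_n\}$. Then
$\mathrm{Span}\{w_1, \dots, w_n\} \subseteq \mathrm{Span}\{v_1, \dots, v_n\}$ and for every $1 \le k \le n$ we have
$$w_k = a_{1k}v_1 + \dots + a_{nk}v_n,$$
for some $a_{jk} \in \mathbb{K}$, $1 \le j \le n$. Let
$$A = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{bmatrix}.$$
If
$$A \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix},$$
then we have
$$x_1 w_1 + \dots + x_n w_n = 0,$$
and, since the vectors $w_1, \dots, w_n$ are linearly independent, $x_1 = \dots = x_n = 0$.
This means that $A$ is invertible. Let
$$B = \begin{bmatrix} b_{11} & \dots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \dots & b_{nn} \end{bmatrix} = A^{-1}.$$


Now, because we have
$$v_k = b_{1k}w_1 + \dots + b_{nk}w_n,$$
for every $1 \le k \le n$, we have $\mathrm{Span}\{v_1, \dots, v_n\} = \mathrm{Span}\{w_1, \dots, w_n\}$. Moreover,
the equality
$$y_1 v_1 + \dots + y_n v_n = 0$$
implies
$$B \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$$
and thus $y_1 = \dots = y_n = 0$, because $B$ is invertible. But this means that the
vectors $v_1, \dots, v_n$ are linearly independent. □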


The above proof depends heavily on properties of invertible matrices. Below 
we give a proof that does not use matrices at all. It is based on mathematical 
induction. 


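Second proof. We leave as exercise the proof for $n = 1$. Now we assume that
the theorem holds for $n - 1$ for some $n \ge 2$ and show that it must also hold for
$n$.

Let $w_1, \dots, w_n$ be linearly independent vectors in $\mathrm{Span}\{v_1, \dots, v_n\}$. As in
the first proof, for every $k = 1, 2, \dots, n$ we write
$$w_k = a_{1k}v_1 + \dots + a_{nk}v_n.$$


Since $w_1 \neq 0$ and $w_1 = a_{11}v_1 + \dots + a_{n1}v_n$, one of the numbers $a_{11}, \dots, a_{n1}$ is
different from 0. Without loss of generality, we can assume that $a_{11} \neq 0$. Then
$$v_1 = \frac{1}{a_{11}}(w_1 - a_{21}v_2 - \dots - a_{n1}v_n) = \frac{1}{a_{11}}w_1 - \frac{a_{21}}{a_{11}}v_2 - \dots - \frac{a_{n1}}{a_{11}}v_n. \qquad (1.6)$$


For $k \ge 2$ we get
$$w_k - \frac{a_{1k}}{a_{11}}w_1 = \left(a_{2k} - \frac{a_{1k}a_{21}}{a_{11}}\right)v_2 + \dots + \left(a_{nk} - \frac{a_{1k}a_{n1}}{a_{11}}\right)v_n.$$


The vectors $w_2 - \frac{a_{12}}{a_{11}}w_1, \dots, w_n - \frac{a_{1n}}{a_{11}}w_1$ are linearly independent. Indeed, if
$$x_2\left(w_2 - \frac{a_{12}}{a_{11}}w_1\right) + \dots + x_n\left(w_n - \frac{a_{1n}}{a_{11}}w_1\right) = 0,$$
then
$$-\left(\frac{a_{12}}{a_{11}}x_2 + \dots + \frac{a_{1n}}{a_{11}}x_n\right)w_1 + x_2 w_2 + \dots + x_n w_n = 0.$$


This implies $x_2 = \dots = x_n = 0$, because the vectors $w_1, \dots, w_n$ are linearly
independent.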
Now, since $w_2 - \frac{a_{12}}{a_{11}}w_1, \dots, w_n - \frac{a_{1n}}{a_{11}}w_1$ are linearly independent vectors in
$\mathrm{Span}\{v_2, \dots, v_n\}$, by the inductive assumption we have
$$\mathrm{Span}\left\{w_2 - \frac{a_{12}}{a_{11}}w_1, \dots, w_n - \frac{a_{1n}}{a_{11}}w_1\right\} = \mathrm{Span}\{v_2, \dots, v_n\}$$
and the vectors $v_2, \dots, v_n$ are linearly independent. Consequently,
$$\mathrm{Span}\{w_1, \dots, w_n\} = \mathrm{Span}\left\{w_1, w_2 - \frac{a_{12}}{a_{11}}w_1, \dots, w_n - \frac{a_{1n}}{a_{11}}w_1\right\}$$
$$= \mathrm{Span}\{w_1, v_2, \dots, v_n\}$$
$$= \mathrm{Span}\{v_1, \dots, v_n\},$$
where the last equality follows by (1.6). It remains to be shown that the vectors
$v_1, \dots, v_n$ are linearly independent. We argue by contradiction.

We already know that the vectors $v_2, \dots, v_n$ are linearly independent. Suppose
$v_1 \in \mathrm{Span}\{v_2, \dots, v_n\}$. Then

$$\mathrm{Span}\{v_2, \dots, v_n\} = \mathrm{Span}\{v_1, v_2, \dots, v_n\}.$$

Since $w_2, \dots, w_n$ are linearly independent vectors in $\mathrm{Span}\{v_2, \dots, v_n\}$, we have
$\mathrm{Span}\{v_2, \dots, v_n\} = \mathrm{Span}\{w_2, \dots, w_n\}$, by our inductive assumption. Consequently,
we have $\mathrm{Span}\{w_2, \dots, w_n\} = \mathrm{Span}\{v_1, v_2, \dots, v_n\}$ and thus also
$w_1 \in \mathrm{Span}\{w_2, \dots, w_n\}$, which contradicts linear independence of the vectors
$w_1, w_2, \dots, w_n$. This proves that $v_1 \notin \mathrm{Span}\{v_2, \dots, v_n\}$ and therefore the vectors
$v_1, v_2, \dots, v_n$ are linearly independent. □


Because Theorem 1.5.2 is of central importance in the discussion of the 
dimension of a vector space, we give still another proof. It is an induction proof 
that uses Corollaries 1.4.16 and 1.4.17 and Theorem 1.4.18. It is instructive to 
study and compare these three different arguments. 


Third proof. We leave as exercise the proof for $n = 1$. Now we assume that the
theorem holds for all integers $p < n$ and show that it must also hold for $n$.


Let $w_1, \dots, w_n$ be linearly independent vectors in $\mathrm{Span}\{v_1, \dots, v_n\}$.

First we show that the set $\{w_1, \dots, w_n\}$ is a basis of $\mathrm{Span}\{v_1, v_2, \dots, v_n\}$.
We argue by contradiction. If the set $\{w_1, \dots, w_n\}$ is not a basis of
$\mathrm{Span}\{v_1, v_2, \dots, v_n\}$, then, by Corollary 1.4.16, there is an integer $r \ge 1$ and linearly independent
vectors $v_{i_1}, \dots, v_{i_r} \in \{v_1, \dots, v_n\}$ such that the set
$\{w_1, \dots, w_n, v_{i_1}, \dots, v_{i_r}\}$ is a basis of $\mathrm{Span}\{v_1, v_2, \dots, v_n\}$. Now, again by Corollary 1.4.16, there
are vectors $v_{j_1}, \dots, v_{j_p} \in \{v_1, \dots, v_n\}$ such that
$$\{v_{j_1}, \dots, v_{j_p}, v_{i_1}, \dots, v_{i_r}\}$$
is a basis of $\mathrm{Span}\{v_1, v_2, \dots, v_n\}$. Note that we must have $p < n$, because if
$p = n$ then
$$\mathrm{Span}\{v_{j_1}, \dots, v_{j_p}\} = \mathrm{Span}\{v_1, \dots, v_n\},$$
which is not possible because $r \ge 1$.

Since the subspaces $\mathrm{Span}\{w_1, \dots, w_n\}$ and $\mathrm{Span}\{v_{j_1}, \dots, v_{j_p}\}$ are complements
of the same subspace $\mathrm{Span}\{v_{i_1}, \dots, v_{i_r}\}$, the subspace $\mathrm{Span}\{w_1, \dots, w_n\}$
has a basis with $p$ elements $\{u_1, \dots, u_p\}$, by Theorem 1.4.18. But then the linearly
independent vectors $w_1, \dots, w_p$ are in the subspace
$$\mathrm{Span}\{u_1, \dots, u_p\} = \mathrm{Span}\{w_1, \dots, w_n\}.$$


Consequently, by our inductive assumption, $\{w_1, \dots, w_p\}$ is a basis of
$\mathrm{Span}\{w_1, \dots, w_n\}$. But this contradicts independence of the vectors $w_1, \dots, w_n$, because
$p < n$. This completes the proof of the fact that $\{w_1, \dots, w_n\}$ is a basis of
$\mathrm{Span}\{v_1, v_2, \dots, v_n\}$.

Now we show that the vectors $v_1, \dots, v_n$ are linearly independent. Again
we use a proof by contradiction. If $v_1, \dots, v_n$ are not linearly independent,
then, by Corollary 1.4.17, there is an integer $r < n$ and a set $\{v_{i_1}, \dots, v_{i_r}\} \subseteq
\{v_1, \dots, v_n\}$ of linearly independent vectors such that
$$\mathrm{Span}\{v_{i_1}, \dots, v_{i_r}\} = \mathrm{Span}\{v_1, \dots, v_n\}.$$


But then the linearly independent vectors $w_1, \dots, w_r$ are in $\mathrm{Span}\{v_{i_1}, \dots, v_{i_r}\}
= \mathrm{Span}\{v_1, \dots, v_n\}$. Consequently, by our inductive assumption, $\{w_1, \dots, w_r\}$
is a basis of $\mathrm{Span}\{v_1, v_2, \dots, v_n\}$, contradicting linear independence of the vectors
$w_1, \dots, w_n$, because $r < n$. This contradiction proves that the vectors
$v_1, \dots, v_n$ are linearly independent. □


The next theorem is an easy consequence of Theorem 1.5.2. 


Theorem 1.5.3. If $\{v_1, \dots, v_n\}$ is a basis of the vector space $\mathcal{V}$, then
any $n + 1$ vectors in $\mathcal{V}$ are linearly dependent.


Proof. Let $w_1, \dots, w_n, w_{n+1} \in \mathcal{V}$. If the vectors $w_1, \dots, w_n$ are linearly dependent,
we are done. If the vectors $w_1, \dots, w_n$ are linearly independent, then the
set $\{w_1, \dots, w_n\}$ is a basis of $\mathcal{V}$, by Theorem 1.5.2. Consequently $w_{n+1}$ can be
written as a linear combination of the vectors $w_1, \dots, w_n$, which implies that
the vectors $w_1, \dots, w_n, w_{n+1}$ are linearly dependent. □


Corollary 1.5.4. If both $\{v_1, \dots, v_m\}$ and $\{w_1, \dots, w_n\}$ are bases of a
vector space $\mathcal{V}$, then $m = n$.


Proof. This is an immediate consequence of Theorem 1.5.3. □


Now we are ready to define the dimension of a vector space spanned by a 
finite number of vectors. 




Definition 1.5.5. If a vector space $\mathcal{V}$ has a basis with $n$ vectors, then
the number $n$ is called the dimension of $\mathcal{V}$ and is denoted by $\dim \mathcal{V}$.


Additionally, we define $\dim\{0\} = 0$.


If a vector space $\mathcal{V}$ has a basis with a finite number of vectors, we say that $\mathcal{V}$
is finite dimensional and we write $\dim \mathcal{V} < \infty$. Not every vector space is finite
dimensional.


Example 1.5.6. The space of all polynomials $\mathcal{P}(\mathbb{K})$ is not a finite dimensional
space. Suppose, on the contrary, that $\dim \mathcal{P}(\mathbb{K}) = n$ for some positive integer
$n$. Then $\mathcal{P}(\mathbb{K}) = \mathrm{Span}\{p_1, \dots, p_n\}$ for some nonzero polynomials $p_1, \dots, p_n$.
Let $m$ be the maximum degree of the polynomials $p_1, \dots, p_n$. Since the degree
of every linear combination of the polynomials $p_1, \dots, p_n$ is at most $m$, the
polynomial $p(t) = t^{m+1}$ is not in $\mathcal{P}(\mathbb{K}) = \mathrm{Span}\{p_1, \dots, p_n\}$, a contradiction.


The following useful result can be easily obtained from Theorem 1.5.2. 


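Theorem 1.5.7. Let $\mathcal{V}$ be a vector space such that $\dim \mathcal{V} = n$.


(a) Any set of $n$ linearly independent vectors from $\mathcal{V}$ is a basis of $\mathcal{V}$;


(b) If $\mathcal{V} = \mathrm{Span}\{w_1, \dots, w_n\}$, then $\{w_1, \dots, w_n\}$ is a basis of $\mathcal{V}$.


Proof. Let $\dim \mathcal{V} = n$ and let $\{v_1, \dots, v_n\}$ be a basis of $\mathcal{V}$.
If the vectors $w_1, \dots, w_n \in \mathcal{V}$ are linearly independent, then
$$\mathrm{Span}\{w_1, \dots, w_n\} = \mathrm{Span}\{v_1, \dots, v_n\} = \mathcal{V},$$
by Theorem 1.5.2. Thus $\{w_1, \dots, w_n\}$ is a basis of $\mathcal{V}$.


If $\mathcal{V} = \mathrm{Span}\{w_1, \dots, w_n\}$, then the linearly independent vectors $v_1, \dots, v_n$
are in $\mathrm{Span}\{w_1, \dots, w_n\}$ and, again by Theorem 1.5.2, the vectors $w_1, \dots, w_n$
are linearly independent. This means that the set $\{w_1, \dots, w_n\}$ is a basis of
$\mathcal{V}$. □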


Example 1.5.8. The set of solutions of the differential equation
$$y'' + y = 0$$
is a vector space of dimension 2 over $\mathbb{C}$. Both $\{e^{it}, e^{-it}\}$ and $\{\cos t, \sin t\}$ are
bases in this vector space.
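
That the two bases span the same space can be seen from Euler's formula, which expresses $\cos t$ and $\sin t$ through $e^{it}$ and $e^{-it}$. The following numerical check is only a sketch; the function names are chosen for this illustration.

```python
import cmath
import math

def cos_from_exponentials(t):
    """cos t written as a linear combination of e^{it} and e^{-it}."""
    return (cmath.exp(1j * t) + cmath.exp(-1j * t)) / 2

def sin_from_exponentials(t):
    """sin t written as a linear combination of e^{it} and e^{-it}."""
    return (cmath.exp(1j * t) - cmath.exp(-1j * t)) / 2j

for t in (0.0, 0.7, 2.5):
    # Both differences should be zero up to rounding error.
    print(abs(cos_from_exponentials(t) - math.cos(t)),
          abs(sin_from_exponentials(t) - math.sin(t)))
```

Conversely, $e^{it} = \cos t + i \sin t$ and $e^{-it} = \cos t - i \sin t$, so each basis lies in the span of the other.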




Example 1.5.9. Let $a$, $b$, and $c$ be distinct numbers from $\mathbb{K}$ and let
$$\mathcal{U} = \{p \in \mathcal{P}_n(\mathbb{K}) : p(a) = p(b) = p(c)\}.$$


The dimension of the vector space $\mathcal{U}$ is $n - 2$ (see Example 1.3.26).


Example 1.5.10. Let $a$ be an arbitrary number from $\mathbb{K}$ and let
$$\mathcal{U} = \{p \in \mathcal{P}_n(\mathbb{K}) : p(a) = p'(a) = p''(a) = 0\}.$$


Then $\dim \mathcal{U} = n - 3$ (see Example 1.3.21).


Example 1.5.11. Let $n$ be an integer greater than 1.
(a) $\dim \mathcal{S}_{n\times n}(\mathbb{K}) = \frac{1}{2}n(n + 1)$ (see Example 1.2.5)
(b) $\dim \mathcal{A}_{n\times n}(\mathbb{K}) = \frac{1}{2}n(n - 1)$ (see Example 1.2.5)


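Example 1.5.12. Show that
$$\dim \mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 1 \\ -i \\ 0 \end{bmatrix}, \begin{bmatrix} 1-2i \\ -1 \\ -2-i \\ -2i \end{bmatrix}, \begin{bmatrix} i \\ 1 \\ 1 \\ i \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ i \\ 1 \end{bmatrix} \right\} = 3$$
and find all bases of the above subspace in the set
$$\left\{ \begin{bmatrix} 1 \\ 1 \\ -i \\ 0 \end{bmatrix}, \begin{bmatrix} 1-2i \\ -1 \\ -2-i \\ -2i \end{bmatrix}, \begin{bmatrix} i \\ 1 \\ 1 \\ i \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ i \\ 1 \end{bmatrix} \right\}.$$


Solution. Since the reduced row echelon form of the matrix
$$\begin{bmatrix} 1 & 1-2i & i & 0 \\ 1 & -1 & 1 & 1 \\ -i & -2-i & 1 & i \\ 0 & -2i & i & 1 \end{bmatrix}$$
is
$$\begin{bmatrix} 1 & 0 & \frac{1}{2} & 0 \\ 0 & 1 & -\frac{1}{2} & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
and the sets
$$\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \right\}, \quad \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \right\}, \quad \left\{ \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \right\}$$
are linearly independent, the following sets are bases:
$$\left\{ \begin{bmatrix} 1 \\ 1 \\ -i \\ 0 \end{bmatrix}, \begin{bmatrix} 1-2i \\ -1 \\ -2-i \\ -2i \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ i \\ 1 \end{bmatrix} \right\}, \quad \left\{ \begin{bmatrix} 1 \\ 1 \\ -i \\ 0 \end{bmatrix}, \begin{bmatrix} i \\ 1 \\ 1 \\ i \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ i \\ 1 \end{bmatrix} \right\}, \quad \left\{ \begin{bmatrix} 1-2i \\ -1 \\ -2-i \\ -2i \end{bmatrix}, \begin{bmatrix} i \\ 1 \\ 1 \\ i \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ i \\ 1 \end{bmatrix} \right\}.$$
□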
In the next example we use Theorem 1.5.7 to obtain the characterization of 


invertible matrices in Example 1.3.7. 


Example 1.5.13. Show that a matrix $A \in \mathcal{M}_{n\times n}(\mathbb{K})$ is invertible if and only
if the only solution of the equation $Ax = 0$ is $x = 0$.


Solution. If $A$ is invertible and $Ax = 0$, then
$$x = A^{-1}Ax = A^{-1}0 = 0.$$


Suppose now that the only solution of the equation $Ax = 0$ is $x = 0$. We
write $A = \begin{bmatrix} a_1 & \dots & a_n \end{bmatrix}$, where $a_1, \dots, a_n$ are the columns of $A$, and
$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$.
The assumption that the only solution of the equation
$$x_1 a_1 + \dots + x_n a_n = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$$
is $\begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix}$ implies that the vectors $a_1, \dots, a_n$ are linearly independent. Consequently,
by Theorem 1.5.7, the set $\{a_1, \dots, a_n\}$ is a basis of $\mathbb{K}^n$. If $e_1, \dots, e_n$
are the columns of the unit matrix $I_n$, then for any $j \in \{1, \dots, n\}$ we have
$$e_j = b_{1j}a_1 + \dots + b_{nj}a_n,$$
for some $b_{jk} \in \mathbb{K}$, that is,
$$I_n = AB,$$
where $B$ is the matrix with entries $b_{jk}$.
It is easy to see that the only solution of the equation $Bx = 0$ is $x = 0$.
Arguing as before, we can find a matrix $C$ such that
$$I_n = BC.$$


Since
$$A = AI_n = ABC = I_n C = C,$$
we get
$$AB = BA = I_n.$$

Example 1.5.14. Show that the dimension of the vector space U in Example 
Ihde Sis als 


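Solution. Let $x$ be a vector in $\mathbb{R}^2$ which is not on the vector line $\mathbb{R}a$. Consequently,
the vectors $x$ and $a$ are linearly independent. If $y$ is in $\mathbb{R}^2$ then there
are real numbers $\alpha$ and $\beta$ such that
$$y = \alpha a + \beta x.$$
Hence
$$\hat{y} = \widehat{\alpha a + \beta x} = \widehat{\alpha a} + \widehat{\beta x} = \widehat{\beta x} = \beta\hat{x}.$$
This means that $\{\hat{x}\}$ is a basis in $\mathcal{U}$ and consequently $\dim \mathcal{U} = 1$. □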


In Section 1.3 we remark that linear independence of vectors may depend 
on the field of scalars K. Since the dimension of a vector space is closely related 
to linear independence of vectors (being the maximum number of linearly in- 
dependent vectors in the space), it should be expected that the dimension of a 
vector space also depends on whether the space is treated as a real vector space 
or a complex vector space. 




Example 1.5.15. Let $\mathcal{V}$ be a complex vector space of dimension $n$. Show that the
dimension of the real vector space $\mathcal{V}$ is $2n$.


Solution. Let $\mathcal{V}$ be a complex vector space of dimension $n$ and let $\{v_1, \dots, v_n\}$ be a
basis of $\mathcal{V}$. This means that every $v \in \mathcal{V}$ has a unique representation of the
form
$$v = z_1 v_1 + \dots + z_n v_n$$
with $z_1, \dots, z_n \in \mathbb{C}$.
We will show that $\{v_1, \dots, v_n, iv_1, \dots, iv_n\}$ is a basis of the real vector
space $\mathcal{V}$. If
$$a_1 v_1 + \dots + a_n v_n + b_1 iv_1 + \dots + b_n iv_n = 0$$
with $a_1, \dots, a_n, b_1, \dots, b_n \in \mathbb{R}$, then
$$(a_1 + b_1 i)v_1 + \dots + (a_n + b_n i)v_n = 0.$$


Since the vectors $v_1, \dots, v_n$ are linearly independent in the complex vector
space $\mathcal{V}$, we must have
$$a_1 + b_1 i = \dots = a_n + b_n i = 0,$$
and thus $a_1 = \dots = a_n = b_1 = \dots = b_n = 0$. This proves that the vectors
$v_1, \dots, v_n, iv_1, \dots, iv_n$ are linearly independent in the real vector space $\mathcal{V}$.


Now we need to show that every vector $v \in \mathcal{V}$ can be written in the form
$$x_1 v_1 + \dots + x_n v_n + y_1 iv_1 + \dots + y_n iv_n$$
with $x_1, \dots, x_n, y_1, \dots, y_n \in \mathbb{R}$. Indeed, if $v \in \mathcal{V}$, then
$$v = z_1 v_1 + \dots + z_n v_n$$
for some $z_1, \dots, z_n \in \mathbb{C}$. If we write $z_j = x_j + y_j i$, then
$$z_1 v_1 + \dots + z_n v_n = (x_1 + y_1 i)v_1 + \dots + (x_n + y_n i)v_n$$
$$= x_1 v_1 + \dots + x_n v_n + y_1 iv_1 + \dots + y_n iv_n,$$
which is the desired representation. □
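
For a concrete picture, take the complex vector space $\mathbb{C}^n$ with its standard basis $e_1, \dots, e_n$ (this choice, and the function names below, are assumptions made only for illustration). The $2n$ real coordinates with respect to $e_1, \dots, e_n, ie_1, \dots, ie_n$ are then the real parts followed by the imaginary parts.

```python
def real_coordinates(z):
    """Real coordinates of z in C^n w.r.t. e_1, ..., e_n, i*e_1, ..., i*e_n."""
    return [c.real for c in z] + [c.imag for c in z]

def from_real_coordinates(coords):
    """Rebuild the vector of C^n from its 2n real coordinates."""
    n = len(coords) // 2
    return [complex(coords[j], coords[n + j]) for j in range(n)]

z = [2 - 1j, 3j]                 # a vector of C^2
coords = real_coordinates(z)     # four real numbers
print(coords)
print(from_real_coordinates(coords))
```

The round trip recovers the original vector, which reflects the uniqueness of the representation shown in the solution.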


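Theorem 1.5.16. Let $\mathcal{U}$ be a subspace of a finite dimensional vector
space $\mathcal{V}$. Then


(a) $\mathcal{U}$ is finite dimensional and $\dim \mathcal{U} \le \dim \mathcal{V}$;


(b) If $\dim \mathcal{U} = \dim \mathcal{V}$, then $\mathcal{U} = \mathcal{V}$.


Proof. Both properties are trivially true if $\mathcal{U} = \{0\}$.


If $\mathcal{U}$ is a nontrivial subspace of $\mathcal{V}$ with $\mathcal{U} \neq \mathcal{V}$ and $\{v_1, \dots, v_n\}$ is a basis of $\mathcal{V}$, then
there exist an integer $m \ge 1$ and linearly independent vectors $v_{i_1}, \dots, v_{i_m} \in
\{v_1, \dots, v_n\}$ such that
$$\mathcal{U} \oplus \mathbb{K}v_{i_1} \oplus \dots \oplus \mathbb{K}v_{i_m} = \mathcal{V},$$
by Theorem 1.4.14. Since we also have
$$\mathbb{K}v_{j_1} \oplus \dots \oplus \mathbb{K}v_{j_{n-m}} \oplus \mathbb{K}v_{i_1} \oplus \dots \oplus \mathbb{K}v_{i_m} = \mathcal{V},$$
where $\{j_1, \dots, j_{n-m}, i_1, \dots, i_m\} = \{1, \dots, n\}$, the vector subspace $\mathcal{U}$ has a basis
with $n - m$ vectors, by Theorem 1.4.18. Consequently,
$$\dim \mathcal{U} = n - m < n = \dim \mathcal{V}.$$


If $\mathcal{U} = \mathcal{V}$, then obviously $\dim \mathcal{U} = \dim \mathcal{V}$. In either case, since $\dim \mathcal{U} \le n$, the
space $\mathcal{U}$ is finite dimensional. This completes the proof of part (a).


Part (b) is an immediate consequence of Theorem 1.5.7. □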


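Example 1.5.17. Let $\mathcal{U}$ and $\mathcal{V}$ be finite dimensional vector subspaces of a
vector space $\mathcal{W}$. Show that
$$\dim(\mathcal{U} + \mathcal{V}) = \dim \mathcal{U} + \dim \mathcal{V} - \dim(\mathcal{U} \cap \mathcal{V}).$$
Solution. First we note that $\mathcal{U} + \mathcal{V}$ is a finite dimensional vector space such
that $\mathcal{U} \subseteq \mathcal{U} + \mathcal{V}$ and $\mathcal{V} \subseteq \mathcal{U} + \mathcal{V}$.
Let $\{x_1, \dots, x_p\}$ be a basis of $\mathcal{U} \cap \mathcal{V}$, $\{y_1, \dots, y_q\}$ be a basis of a complement
$\mathcal{U}'$ of $\mathcal{U} \cap \mathcal{V}$ in $\mathcal{U}$, and let $\{z_1, \dots, z_r\}$ be a basis of a complement $\mathcal{V}'$ of $\mathcal{U} \cap \mathcal{V}$ in
$\mathcal{V}$. According to Example 1.4.3 we have
$$\mathcal{U} + \mathcal{V} = \mathcal{U}' \oplus (\mathcal{U} \cap \mathcal{V}) \oplus \mathcal{V}'.$$
Consequently, by Theorem 1.4.10,
$$\{y_1, \dots, y_q, x_1, \dots, x_p, z_1, \dots, z_r\}$$
is a basis of $\mathcal{U} + \mathcal{V}$. Since $\dim(\mathcal{U} + \mathcal{V}) = p + q + r$, $\dim \mathcal{U} = p + q$, $\dim \mathcal{V} = p + r$,
and $\dim(\mathcal{U} \cap \mathcal{V}) = p$, the desired equality follows. □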
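
The dimension formula can be checked on a small case in which the intersection is known in advance. The sketch below uses two coordinate planes in $\mathbb{Q}^4$; the helper `rank` and the sample subspaces are assumptions made for this illustration, and $\dim(\mathcal{U} + \mathcal{V})$ is computed as the rank of the combined list of basis vectors.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of coordinate vectors over Q, by Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

U = [(1, 0, 0, 0), (0, 1, 0, 0)]   # basis of U = Span{e1, e2}
V = [(0, 1, 0, 0), (0, 0, 1, 0)]   # basis of V = Span{e2, e3}

dim_sum = rank(U + V)                            # dim(U + V): the lists are concatenated
dim_intersection = rank(U) + rank(V) - dim_sum   # value predicted by the formula
print(dim_sum, dim_intersection)
```

Here $\mathcal{U} \cap \mathcal{V} = \mathrm{Span}\{e_2\}$ has dimension 1, which agrees with the value the formula predicts.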




We close this section with a useful observation that is an immediate consequence
of Corollary 1.4.16. Note that the theorem implies that every collection
of linearly independent vectors in a finite dimensional vector space $\mathcal{V}$ can be
extended to a basis of $\mathcal{V}$.


Theorem 1.5.18. Let $\{v_1, \dots, v_n\}$ be a basis of a vector space $\mathcal{V}$. If the
vectors $w_1, \dots, w_k \in \mathcal{V}$ are linearly independent and $k < n$, then there
are vectors $w_{k+1}, \dots, w_n \in \{v_1, \dots, v_n\}$ such that
$$\{w_1, \dots, w_k, w_{k+1}, \dots, w_n\}$$
is a basis of $\mathcal{V}$.


Example 1.5.19. In Example 1.3.26 we use the fact that the polynomials
$$1, \; q(t), \; tq(t), \; \dots, \; t^{n-3}q(t),$$
where $q(t) = (t - a)(t - b)(t - c)$, are linearly independent vectors in $\mathcal{P}_n(\mathbb{K})$.
Extend this collection of vectors to a basis of $\mathcal{P}_n(\mathbb{K})$.


Solution. We can use two vectors from the standard basis of $\mathcal{P}_n(\mathbb{K})$, that is,
$$\{1, t, \dots, t^n\},$$
namely $t$ and $t^2$. It is easy to check that the set
$$\{1, t, t^2, q(t), tq(t), \dots, t^{n-3}q(t)\}$$
is a basis in $\mathcal{P}_n(\mathbb{K})$. □


1.6 Change of basis 


Problems in linear algebra and its applications often require working with dif- 
ferent bases in the same vector space. If coordinates of a vector in one basis are 
known, we need to be able to efficiently find its coordinates in a different basis. 
The following theorem describes this process in terms of matrix multiplication. 




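Theorem 1.6.1. Let $\{v_1, \dots, v_n\}$ and $\{w_1, \dots, w_n\}$ be bases of a vector
space $\mathcal{V}$ and let $a_{jk} \in \mathbb{K}$, for $1 \le j, k \le n$, be the unique numbers such
that
$$v_k = a_{1k}w_1 + \dots + a_{nk}w_n,$$
for every $1 \le k \le n$. For every $v \in \mathcal{V}$, if
$$v = x_1 v_1 + \dots + x_n v_n,$$
then
$$v = y_1 w_1 + \dots + y_n w_n,$$
where the numbers $y_1, \dots, y_n$ are given by
$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}.$$


Proof. Let $v$ be an arbitrary vector in $\mathcal{V}$. If $v = x_1 v_1 + \dots + x_n v_n$, then
$$v = x_1 v_1 + \dots + x_n v_n$$
$$= x_1(a_{11}w_1 + \dots + a_{n1}w_n) + \dots + x_n(a_{1n}w_1 + \dots + a_{nn}w_n)$$
$$= \begin{bmatrix} a_{11} & \dots & a_{1n} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} w_1 + \dots + \begin{bmatrix} a_{n1} & \dots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} w_n.$$
Thus the coordinates of $v$ in basis $\{w_1, \dots, w_n\}$ are
$$y_k = \begin{bmatrix} a_{k1} & \dots & a_{kn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix},$$
for every $k \in \{1, \dots, n\}$, which can be written as
$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} a_{11} & \dots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}.$$
□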
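
As a small numerical illustration of the theorem, one can take a hypothetical pair of bases of $\mathbb{R}^2$: the basis $v_1 = (1, 1)$, $v_2 = (1, -1)$ and the standard basis $w_1 = (1, 0)$, $w_2 = (0, 1)$. These vectors and the function name below are assumptions chosen for this sketch. Column $k$ of the matrix holds the coordinates of $v_k$ with respect to $w_1, w_2$.

```python
def change_coordinates(matrix, x):
    """Multiply the change of coordinates matrix by the coordinate column x."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in matrix]

# Hypothetical bases of R^2: v1 = (1, 1), v2 = (1, -1), and the standard basis w1, w2.
# Since v1 = 1*w1 + 1*w2 and v2 = 1*w1 - 1*w2, the columns are (1, 1) and (1, -1).
matrix = [[1, 1],
          [1, -1]]

x = [2, 3]                      # v = 2*v1 + 3*v2
y = change_coordinates(matrix, x)
print(y)                        # coordinates of v with respect to w1, w2
```

Computing directly, $2(1, 1) + 3(1, -1) = (5, -1)$, which agrees with the matrix product.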


Definition 1.6.2. Let $\mathcal{B} = \{v_1, \dots, v_n\}$ and $\mathcal{C} = \{w_1, \dots, w_n\}$ be bases
of a vector space $\mathcal{V}$. The $n \times n$ matrix in Theorem 1.6.1 is called the
change of coordinates matrix from the basis $\mathcal{B}$ to the basis $\mathcal{C}$ and is
denoted by $\mathrm{Id}_{\mathcal{B}\to\mathcal{C}}$.


We use $\mathrm{Id}$ in the above definition because in Theorem 1.6.1 we have
$$\mathrm{Id}(v_k) = v_k = a_{1k}w_1 + \dots + a_{nk}w_n,$$
where the function $\mathrm{Id} : \mathcal{V} \to \mathcal{V}$ is defined by $\mathrm{Id}(v) = v$ for every $v \in \mathcal{V}$. This
notation will be explained better in Chapter 2.


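Example 1.6.3. Let $\{v_1, v_2, v_3\}$ and $\{w_1, w_2, w_3\}$ be bases for a vector space
$\mathcal{V}$. If
$$v_1 = iw_1 + iw_2 + w_3$$
$$v_2 = -iw_1 + iw_2 + w_3$$
$$v_3 = w_1 + iw_2 + iw_3,$$
write the vector $v = (2 - i)v_1 + 3v_2 + iv_3$ as a linear combination of the vectors
$w_1$, $w_2$, and $w_3$.


Solution. Since the change of coordinates matrix from the basis $\{v_1, v_2, v_3\}$
to the basis $\{w_1, w_2, w_3\}$ is
$$\begin{bmatrix} i & -i & 1 \\ i & i & i \\ 1 & 1 & i \end{bmatrix}$$
and
$$\begin{bmatrix} i & -i & 1 \\ i & i & i \\ 1 & 1 & i \end{bmatrix} \begin{bmatrix} 2-i \\ 3 \\ i \end{bmatrix} = \begin{bmatrix} 1 \\ 5i \\ 4-i \end{bmatrix},$$
we have
$$v = w_1 + 5iw_2 + (4 - i)w_3.$$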


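Example 1.6.4. Let $a \in \mathbb{K}$. We consider the vector space $\mathcal{P}_3(\mathbb{K})$. Determine
the change of coordinates matrix from the basis
$$\{1, t, t^2, t^3\}$$
to the basis
$$\{1, t - a, (t - a)^2, (t - a)^3\}$$
and write an arbitrary polynomial from $\mathcal{P}_3(\mathbb{K})$ in the basis $\{1, t - a, (t - a)^2,
(t - a)^3\}$.


Solution. Since
$$1 = 1,$$
$$t = a + (t - a),$$
$$t^2 = a^2 + 2a(t - a) + (t - a)^2,$$
$$t^3 = a^3 + 3a^2(t - a) + 3a(t - a)^2 + (t - a)^3,$$
the change of coordinates matrix is
$$\begin{bmatrix} 1 & a & a^2 & a^3 \\ 0 & 1 & 2a & 3a^2 \\ 0 & 0 & 1 & 3a \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$
For an arbitrary polynomial $p(t) = b_0 + b_1 t + b_2 t^2 + b_3 t^3$ we have
$$\begin{bmatrix} 1 & a & a^2 & a^3 \\ 0 & 1 & 2a & 3a^2 \\ 0 & 0 & 1 & 3a \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} a^3 b_3 + a^2 b_2 + ab_1 + b_0 \\ 3a^2 b_3 + 2ab_2 + b_1 \\ 3ab_3 + b_2 \\ b_3 \end{bmatrix},$$
and thus $p(t)$ becomes
$$a^3 b_3 + a^2 b_2 + ab_1 + b_0 + (3a^2 b_3 + 2ab_2 + b_1)(t - a) + (3ab_3 + b_2)(t - a)^2 + b_3(t - a)^3,$$
when written in the basis $\{1, t - a, (t - a)^2, (t - a)^3\}$. Note that we could get
this result by direct calculations. □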
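
The same computation works in any degree, because the binomial theorem gives $t^k = \sum_{j \le k} \binom{k}{j} a^{k-j}(t - a)^j$. The sketch below builds the corresponding matrix and checks the result by evaluating the polynomial in both bases; the function names and the sample polynomial are assumptions made for this illustration.

```python
from math import comb

def shift_matrix(n, a):
    """Matrix whose column k holds the coordinates of t^k in 1, (t-a), ..., (t-a)^n."""
    return [[comb(k, j) * a ** (k - j) if k >= j else 0 for k in range(n + 1)]
            for j in range(n + 1)]

def to_shifted_basis(coeffs, a):
    """Coordinates in powers of (t - a) of the polynomial with the given coefficients."""
    n = len(coeffs) - 1
    m = shift_matrix(n, a)
    return [sum(m[j][k] * coeffs[k] for k in range(n + 1)) for j in range(n + 1)]

def evaluate(coeffs, t, center=0):
    """Evaluate sum_k coeffs[k] * (t - center)^k."""
    return sum(c * (t - center) ** k for k, c in enumerate(coeffs))

p = [1, 0, -1, 2]                 # p(t) = 1 - t^2 + 2t^3, a sample polynomial
shifted = to_shifted_basis(p, 2)  # coordinates in 1, (t-2), (t-2)^2, (t-2)^3
print(shift_matrix(3, 2))
print(shifted)
```

For $n = 3$ the matrix produced by `shift_matrix` has the same shape as the one in the example, with the entries $a, a^2, a^3, 2a, 3a^2, 3a$ evaluated at the chosen $a$.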


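Theorem 1.6.5. Let $\mathcal{B} = \{v_1, \dots, v_n\}$ and $\mathcal{C} = \{w_1, \dots, w_n\}$ be bases
of a vector space $\mathcal{V}$. The change of coordinates matrix $\mathrm{Id}_{\mathcal{B}\to\mathcal{C}}$ is invertible
and
$$(\mathrm{Id}_{\mathcal{B}\to\mathcal{C}})^{-1} = \mathrm{Id}_{\mathcal{C}\to\mathcal{B}}.$$


Proof. If
$$x_1 v_1 + \dots + x_n v_n = y_1 w_1 + \dots + y_n w_n,$$
then
$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \mathrm{Id}_{\mathcal{B}\to\mathcal{C}} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \mathrm{Id}_{\mathcal{C}\to\mathcal{B}} \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix},$$
and consequently
$$\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \mathrm{Id}_{\mathcal{C}\to\mathcal{B}} \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \mathrm{Id}_{\mathcal{C}\to\mathcal{B}}\, \mathrm{Id}_{\mathcal{B}\to\mathcal{C}} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$
and
$$\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \mathrm{Id}_{\mathcal{B}\to\mathcal{C}} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \mathrm{Id}_{\mathcal{B}\to\mathcal{C}}\, \mathrm{Id}_{\mathcal{C}\to\mathcal{B}} \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix},$$
proving that $(\mathrm{Id}_{\mathcal{B}\to\mathcal{C}})^{-1} = \mathrm{Id}_{\mathcal{C}\to\mathcal{B}}$. □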


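Example 1.6.6. Let $\{v_1, v_2, v_3\}$ and $\{w_1, w_2, w_3\}$ be bases of the vector
space $\mathcal{V}$. If
$$v_1 = iw_1 + iw_2 + w_3$$
$$v_2 = -iw_1 + iw_2 + w_3$$
$$v_3 = w_1 + iw_2 + iw_3,$$
write $w_1$, $w_2$, and $w_3$ as linear combinations of the vectors $v_1$, $v_2$, and $v_3$.


Solution. The change of coordinates matrix from the basis $\{v_1, v_2, v_3\}$ to the
basis $\{w_1, w_2, w_3\}$ is
$$\begin{bmatrix} i & -i & 1 \\ i & i & i \\ 1 & 1 & i \end{bmatrix}.$$
The inverse of this matrix is
$$\begin{bmatrix} -\frac{i}{2} & 0 & \frac{1}{2} \\ \frac{i}{2} & -\frac{1+i}{2} & \frac{i}{2} \\ 0 & \frac{1-i}{2} & -\frac{1+i}{2} \end{bmatrix}.$$
Consequently, we can write
$$w_1 = -\frac{i}{2}v_1 + \frac{i}{2}v_2,$$
$$w_2 = -\frac{1+i}{2}v_2 + \frac{1-i}{2}v_3,$$
$$w_3 = \frac{1}{2}v_1 + \frac{i}{2}v_2 - \frac{1+i}{2}v_3.$$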




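Example 1.6.7. It is easy to calculate that the change of coordinates matrix
from the basis
$$\{1, t - a, (t - a)^2, (t - a)^3\}$$
to the basis
$$\{1, t, t^2, t^3\}$$
is
$$\begin{bmatrix} 1 & -a & a^2 & -a^3 \\ 0 & 1 & -2a & 3a^2 \\ 0 & 0 & 1 & -3a \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$


Consequently, the change of coordinates matrix from the basis
$$\{1, t, t^2, t^3\}$$
to the basis
$$\{1, t - a, (t - a)^2, (t - a)^3\}$$
is
$$\begin{bmatrix} 1 & -a & a^2 & -a^3 \\ 0 & 1 & -2a & 3a^2 \\ 0 & 0 & 1 & -3a \\ 0 & 0 & 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & a & a^2 & a^3 \\ 0 & 1 & 2a & 3a^2 \\ 0 & 0 & 1 & 3a \\ 0 & 0 & 0 & 1 \end{bmatrix},$$
which is the matrix we found in Example 1.6.4.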
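
That the two $4 \times 4$ matrices for the bases $\{1, t, t^2, t^3\}$ and $\{1, t - a, (t - a)^2, (t - a)^3\}$ are inverse to each other can be confirmed by multiplying them for a few values of $a$. The following is a minimal sketch; the function names are assumptions made for this illustration.

```python
def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def shifted_to_standard(a):
    """Matrix from the basis 1, (t-a), (t-a)^2, (t-a)^3 to the basis 1, t, t^2, t^3."""
    return [[1, -a, a**2, -a**3],
            [0, 1, -2*a, 3*a**2],
            [0, 0, 1, -3*a],
            [0, 0, 0, 1]]

def standard_to_shifted(a):
    """Matrix from the basis 1, t, t^2, t^3 to the basis 1, (t-a), (t-a)^2, (t-a)^3."""
    return [[1, a, a**2, a**3],
            [0, 1, 2*a, 3*a**2],
            [0, 0, 1, 3*a],
            [0, 0, 0, 1]]

identity = [[int(i == j) for j in range(4)] for i in range(4)]
for a in (-2, 0, 3):
    print(matmul(shifted_to_standard(a), standard_to_shifted(a)) == identity)
```

Both orders of multiplication give the identity matrix, in agreement with Theorem 1.6.5.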


1.7 Exercises 


1.7.1 Definitions and examples 


Exercise 1.1. Let $A = \{(x, y, z) \in \mathbb{R}^3 : x > 0, z < 0\}$. Show that $A$ is not a
vector space.


Exercise 1.2. Let $A = \{(x, y, z) \in \mathbb{R}^3 : x + y + 2z = 0\}$ and $B = \{(x, y, z) \in
\mathbb{R}^3 : x - y + z = 0\}$. Show that $A \cup B$ is not a vector space.


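Exercise 1.3. We define a vector space over $\mathbb{R}$ whose elements are lines in $\mathbb{R}^2$
parallel to a given line. We recall that a line is a set of the form $x + \mathbb{R}a =
\{x + ca : c \in \mathbb{R}\}$ where $x$ and $a$ are vectors in $\mathbb{R}^2$ and $a \neq 0$.

Let $a$ be a fixed nonzero vector in $\mathbb{R}^2$. We define
$$\hat{x} = x + \mathbb{R}a$$
and
$$\hat{x} + \hat{y} = \widehat{x + y} \quad\text{and}\quad c\hat{x} = \widehat{cx}.$$


Show that the set $\mathcal{V} = \{\hat{x} : x \in \mathbb{R}^2\}$ with the operations defined above is a real
vector space.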


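Exercise 1.4. We define a vector space over $\mathbb{R}$ whose elements are planes
in $\mathbb{R}^3$ parallel to a given plane. We recall that a plane is a set of the form
$x + \mathbb{R}a + \mathbb{R}b = \{x + ca + db : c, d \in \mathbb{R}\}$ where $x$, $a$ and $b$ are vectors in $\mathbb{R}^3$ with
$a \neq 0$ and $b \neq 0$.

Let $a$ and $b$ be fixed nonzero vectors in $\mathbb{R}^3$. We define
$$\hat{x} = x + \mathbb{R}a + \mathbb{R}b$$
and
$$\hat{x} + \hat{y} = \widehat{x + y} \quad\text{and}\quad c\hat{x} = \widehat{cx}.$$


Show that the set $\mathcal{V} = \{\hat{x} : x \in \mathbb{R}^3\}$ with the operations defined above is a real
vector space.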


Exercise 1.5. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces. Show that the set of all functions
$f : \mathcal{V} \to \mathcal{W}$ with the operations of addition and scalar multiplication defined as
$$(f + g)(x) = f(x) + g(x) \quad\text{and}\quad (\alpha f)(x) = \alpha f(x),$$
is a vector space.


1.7.2 Subspaces 


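Exercise 1.6. Let $a, b \in \mathbb{K}$. Write the polynomial $(t + a)^3$ as a linear combination
of the polynomials $1$, $t + b$, $(t + b)^2$, $(t + b)^3$.


Exercise 1.7. Let $\mathcal{U}$, $\mathcal{V}$, and $\mathcal{W}$ be subspaces of a vector space $\mathcal{X}$. If $\mathcal{U} \subseteq \mathcal{V}$,
show that $\mathcal{V} \cap (\mathcal{U} + \mathcal{W}) = \mathcal{U} + (\mathcal{V} \cap \mathcal{W})$.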


Exercise 1.8. Let $v_1, v_2, v_3$ be vectors in a vector space $\mathcal{V}$. If
$$w_1 = v_1 + 3v_2 + 5v_3$$
$$w_2 = v_1 + v_2 + 3v_3$$
$$w_3 = v_1 + 2v_2 + 4v_3$$
$$w_4 = v_1 - v_2 + v_3$$
$$w_5 = 3v_1 - v_2 + 5v_3,$$
show that $\mathrm{Span}\{w_1, w_2\} = \mathrm{Span}\{w_3, w_4, w_5\}$.


Exercise 1.9. Show that the set {p € P(R) : 1 p(t)t?dt = 0} is a vector 
subspace of P(R). 


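Exercise 1.10. Let $\mathcal{V}$ be a subspace of $\mathcal{M}_{n\times n}(\mathbb{K})$. Show that $\mathcal{V}^T = \{A^T : A \in
\mathcal{V}\}$ is a subspace of $\mathcal{M}_{n\times n}(\mathbb{K})$.


Exercise 1.11. Let $\mathcal{V}_1, \dots, \mathcal{V}_m$ be subspaces of a vector space $\mathcal{V}$. If $\mathcal{W}$ is a
vector space such that $\mathcal{V}_j \subseteq \mathcal{W}$ for every $j \in \{1, \dots, m\}$, show that
$\mathcal{V}_1 + \dots + \mathcal{V}_m \subseteq \mathcal{W}$.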




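Exercise 1.12. If $\mathcal{U}$ is a proper subspace of a vector space $\mathcal{V}$, show that the
following conditions are equivalent:


(a) If $\mathcal{W} \neq \mathcal{U}$ is a subspace of $\mathcal{V}$ and $\mathcal{U} \subseteq \mathcal{W} \subseteq \mathcal{V}$, then $\mathcal{W} = \mathcal{V}$;
(b) For every $x \in \mathcal{V}$ which is not in $\mathcal{U}$ we have $\mathcal{U} + \mathbb{K}x = \mathcal{V}$.


Exercise 1.13. If $\mathcal{U}$ and $\mathcal{W}$ are subspaces of a vector space $\mathcal{V}$, show that the
following conditions are equivalent:


(a) The set $\mathcal{U} \cup \mathcal{W}$ is a subspace of $\mathcal{V}$;
(b) $\mathcal{U} \subseteq \mathcal{W}$ or $\mathcal{W} \subseteq \mathcal{U}$.


Exercise 1.14. Let $\mathcal{U}$ and $\mathcal{W}$ be subspaces of a vector space $\mathcal{V}$ and let $A$ and
$B$ be finite subsets of $\mathcal{V}$ such that $\mathrm{Span}\, A = \mathcal{U}$ and $\mathrm{Span}\, B = \mathcal{W}$. Show that
$\mathrm{Span}(A \cup B) = \mathcal{U} + \mathcal{W}$.


Exercise 1.15. Let $\mathcal{U}$ be a subspace of the vector space $\mathbb{K}$. Show that $\mathcal{U} = \{0\}$
or $\mathcal{U} = \mathbb{K}$.


Exercise 1.16. Show that the set of matrices from $\mathcal{M}_{n\times n}(\mathbb{K})$ which satisfy the
equation $2A + A^T = 0$ is a subspace of $\mathcal{M}_{n\times n}(\mathbb{K})$. Describe this subspace.


Exercise 1.17. Give an example of subspaces $\mathcal{U}$, $\mathcal{V}$, and $\mathcal{W}$ of $\mathcal{P}_3(\mathbb{R})$ such that
$\mathcal{U} \neq \mathcal{V}$ and $\mathcal{U} + \mathcal{W} = \mathcal{V} + \mathcal{W} = \mathcal{P}_3(\mathbb{R})$.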


Exercise 1.18. Show that the set 


pays (ee 0 a 


is a subspace of My xn(K). 


Exercise 1.19. Verify that $\mathcal{M}_{n\times n}(\mathbb{K}) = \mathcal{UD}_{n\times n}(\mathbb{K}) + \mathcal{S}_{n\times n}(\mathbb{K})$, where
$\mathcal{UD}_{n\times n}(\mathbb{K})$ is defined in Exercise 1.18.


Exercise 1.20. If $v_1, v_2, v_3$ are vectors in a vector space $\mathcal{V}$, show that
$\mathrm{Span}\{v_1, v_2, v_3\} = \mathrm{Span}\{7v_1 + 3v_2 - 4v_3, v_2 + 2v_3, 5v_3 - v_2\}$.


Exercise 1.21. Let $a \in \mathbb{K}$. Show that a polynomial $p \in \mathcal{P}_n(\mathbb{K})$ is a linear
combination of the polynomials $(t - a)^2, \dots, (t - a)^n$ if and only if $p(a) = p'(a) =
0$.

Exercise 1.22. Let $\mathcal{U}$ be a subspace of a vector space $\mathcal{V}$ and let $x$ and $y$ be
vectors in $\mathcal{V}$ such that $x \notin \mathcal{U}$. Show that if $x \in \mathcal{U} + \mathbb{K}y$ then $y \in \mathcal{U} + \mathbb{K}x$ and
$y \notin \mathcal{U}$.


Exercise 1.23. Let $f, g \in \mathcal{F}_{\mathbb{K}}(S)$. Write the function $f$ as a linear combination
of the functions $5f - 3g$ and $4f + 7g$.


Exercise 1.24. Show that the function $|t|$ is not a linear combination of the
functions $|t - c_1|, \dots, |t - c_n|$ where $c_1, \dots, c_n$ are nonzero real numbers.


Exercise 1.25. Write the function $\sin(t + a)$ as a linear combination of the
functions $\sin t$ and $\cos t$.




1.7.3. Linearly independent vectors and bases 


Exercise 1.26. Let $\mathcal{U}$ be the set of all infinite sequences $(x_1, x_2, \dots)$ of real
numbers such that $x_{n+2} = x_n$ for every integer $n \ge 1$. Show that $\mathcal{U}$ is a
finite-dimensional subspace of the vector space of all infinite sequences $(x_1, x_2, \dots)$ of
real numbers and determine a basis in $\mathcal{U}$.


Exercise 1.27. Let $\mathcal{U}$ and $\mathcal{W}$ be subspaces of a vector space $\mathcal{V}$ such that
$\mathcal{U} \cap \mathcal{W} = \{0\}$. Show that, if the vectors $u_1, \dots, u_m \in \mathcal{U}$ are linearly independent
and the vectors $w_1, \dots, w_n \in \mathcal{W}$ are linearly independent, then the vectors
$u_1, \dots, u_m, w_1, \dots, w_n$ are linearly independent.


Exercise 1.28. If $v_1, v_2, v_3$ are linearly independent vectors in a vector space
$\mathcal{V}$, show that the vectors $v_1 + v_2$, $v_2 + v_3$, $v_3 + v_1$ are linearly independent.


Exercise 1.29. If $v_1, v_2, v_3, v_4$ are linearly independent vectors in a vector
space $\mathcal{V}$, show that the vectors $v_1 + v_2$, $v_2 + v_3$, $v_3 + v_4$, $v_4 + v_1$ are linearly
dependent.


Exercise 1.30. Show that the functions $1$, $\cos^2 t$, $\cos 2t$ are linearly dependent
elements in $\mathcal{D}_{\mathbb{R}}(\mathbb{R})$.


Exercise 1.31. Show that the matrices 


—1 3 1-21 1-2 9-21 
2 il’ 3 i’ 0 a 
are linearly dependent elements of M2x2(C). 


Exercise 1.32. Let $\{v_1, v_2, v_3\}$ be a basis of a complex vector space $\mathcal{V}$. Show
that the vectors
$$w_1 = iv_1 + v_2 + v_3$$
$$w_2 = v_1 + iv_2 + v_3$$
$$w_3 = v_1 + v_2 + iv_3$$
are linearly independent.


Exercise 1.33. If $v_1, \dots, v_n$ are linearly independent vectors in a vector space
$\mathcal{V}$, show that the vectors $v_3 + v_1 + v_2, \dots, v_n + v_1 + v_2$ are linearly independent.


Exercise 1.34. Find a basis of C? as a vector space over R. 


Exercise 1.35. Show that $\{\cos t, \sin t\}$ and $\{\cos t + i\sin t, \cos t - i\sin t\}$ are
bases of the same complex vector subspace of $\mathcal{F}_{\mathbb{C}}(\mathbb{R})$.


Exercise 1.36. Let $v_1, \dots, v_k, v, w_1, \dots, w_m$ be elements of a vector space $\mathcal{V}$.
If the vectors $v_1, \dots, v_k, v$ are linearly independent and $v \in \mathrm{Span}\{v_1, \dots, v_k,
w_1, \dots, w_m\}$, then there are integers $j_1, \dots, j_{m-1} \in \{1, \dots, m\}$ such that
$$\mathrm{Span}\{v_1, \dots, v_k, v, w_{j_1}, \dots, w_{j_{m-1}}\} = \mathrm{Span}\{v_1, \dots, v_k, w_1, \dots, w_m\}.$$




Exercise 1.37. Let W be a vector space and let v1,...,Vk,W1,---,Wm € W. 
Use Exercise 1.36 to show that, if the vectors v1,..., Vx are linearly independent 
in Span{wi,...,Wm}, then k < m. 


Exercise 1.38. Let vi,...,V, be vectors in a vector space V such that V = 
Span{vi,...,Vn}. Let & be an integer such that 1 < k < n. We suppose that the 
vectors V1,...,Vx are linearly independent and that the vectors v1,...,Vk,Vm 
are linearly dependent for all m € {k+1,...,n}. Show that {vi,...,vz} isa 
basis of V. 


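Exercise 1.39. Let $\mathcal{V}$ be a subspace of a vector space $\mathcal{W}$ and let $\mathbf{v}_1, \dots, \mathbf{v}_k$
be linearly independent vectors in $\mathcal{V}$. Show that either $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is a basis
of $\mathcal{V}$ or there is a vector $\mathbf{v} \in \mathcal{V}$ such that the vectors $\mathbf{v}_1, \dots, \mathbf{v}_k, \mathbf{v}$ are linearly
independent.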


Exercise 1.40. Let $\mathcal{V}$ be a subspace of an $m$-dimensional vector space $\mathcal{W}$.
Using Exercises 1.37 and 1.39, show that $\mathcal{V}$ is a $k$-dimensional vector space for
some $k \le m$.


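Exercise 1.41. Let $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5\}$ be a basis of a vector space $\mathcal{V}$ and let

$$\begin{aligned}
\mathbf{w}_1 &= \mathbf{v}_1 + 3\mathbf{v}_2 + 3\mathbf{v}_3 + 2\mathbf{v}_4 + \mathbf{v}_5 \\
\mathbf{w}_2 &= 2\mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3 + 4\mathbf{v}_4 + \mathbf{v}_5 \\
\mathbf{w}_3 &= \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3 + 2\mathbf{v}_4 + \mathbf{v}_5.
\end{aligned}$$

Determine all two-element sets $\{\mathbf{v}_j, \mathbf{v}_k\} \subset \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5\}$ such that the set
$\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3, \mathbf{v}_j, \mathbf{v}_k\}$ is a basis of $\mathcal{V}$.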


Exercise 1.42. Let {v1, v2, v3, v4} be a basis of a vector space V. If 


Wi = 3V1 bas Avo + 5V3 aha 2v4 


W2 = 2v1 a 5V2 + 3v3 Tr 4v4 
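$$\mathbf{w}_3 = \mathbf{v}_1 + a\mathbf{v}_2 + b\mathbf{v}_3$$

$$\mathbf{w}_4 = c\mathbf{v}_2 + d\mathbf{v}_3 + \mathbf{v}_4,$$

find numbers $a, b, c, d$ such that $\{\mathbf{w}_1, \mathbf{w}_2\}$ and $\{\mathbf{w}_3, \mathbf{w}_4\}$ are bases of the same
subspace.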


Exercise 1.43. Verify that the set 


:3a+y+24+4t(=0,2e7+y4+3z24+t=0 


eA) 
II 
exes 


is a subspace of $\mathbb{R}^4$ and determine a basis of this subspace.


Exercise 1.44. Show that V = {p € P2(R) : fe: p(t)dt = 0, J p(t)at =O0}isa 
vector subspace of P2(R) and determine a basis in this subspace. 




1.7.4 Direct sums


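Exercise 1.45. If $\mathcal{U} = \{f \in \mathcal{F}_{\mathbb{R}}(\mathbb{R}) : f(-t) = f(t)\}$ and $\mathcal{V} = \{f \in \mathcal{F}_{\mathbb{R}}(\mathbb{R}) :
f(-t) = -f(t)\}$, show that $\mathcal{F}_{\mathbb{R}}(\mathbb{R}) = \mathcal{U} \oplus \mathcal{V}$.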


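Exercise 1.46. Let $\mathcal{D}_{n\times n}(\mathbb{K}) = \{[a_{jk}] \in \mathcal{M}_{n\times n}(\mathbb{K}) : a_{jk} = 0 \text{ if } j \neq k\}$. Find a
subspace $\mathcal{V}$ of $\mathcal{M}_{n\times n}(\mathbb{K})$ such that

$$\mathcal{A}_{n\times n}(\mathbb{K}) \oplus \mathcal{D}_{n\times n}(\mathbb{K}) \oplus \mathcal{V} = \mathcal{M}_{n\times n}(\mathbb{K}).$$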


Exercise 1.47. Let $\mathcal{U}$ and $\mathcal{W}$ be vector subspaces of a vector space $\mathcal{V}$. If
$\dim\mathcal{U} + \dim\mathcal{W} = \dim\mathcal{V} + 1$, show that the sum $\mathcal{U} + \mathcal{W}$ is not direct.


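Exercise 1.48. Let $\mathcal{U}_1$, $\mathcal{U}_2$ and $\mathcal{U}_3$ be subspaces of a vector space $\mathcal{V}$. Show that
the sum $\mathcal{U}_1 + \mathcal{U}_2 + \mathcal{U}_3$ is direct if and only if $\mathcal{U}_1 \cap \mathcal{U}_2 = \mathbf{0}$ and $(\mathcal{U}_1 + \mathcal{U}_2) \cap \mathcal{U}_3 = \mathbf{0}$.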


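Exercise 1.49. Let $\mathcal{U}_1$ and $\mathcal{U}_2$ be subspaces of a vector space $\mathcal{V}$ and let
$\mathbf{v}_1, \dots, \mathbf{v}_n \in \mathcal{V}$. We assume that the sum

$$\mathcal{U}_1 + \mathcal{U}_2 + \mathbb{K}\mathbf{v}_1 + \dots + \mathbb{K}\mathbf{v}_n$$

is direct. If $\mathcal{W}_1 = \mathcal{U}_1 + \mathbb{K}\mathbf{v}_1 + \dots + \mathbb{K}\mathbf{v}_n$ and $\mathcal{W}_2 = \mathcal{U}_2 + \mathbb{K}\mathbf{v}_1 + \dots + \mathbb{K}\mathbf{v}_n$, show
that $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a basis of $\mathcal{W}_1 \cap \mathcal{W}_2$.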


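Exercise 1.50. Let $A$, $B$, and $S$ be sets such that $A$ and $B$ are disjoint and
$A \cup B = S$. If $\mathcal{V} = \{f \in \mathcal{F}_{\mathbb{K}}(S) : f(x) = 0 \text{ for all } x \in A\}$ and $\mathcal{W} = \{f \in \mathcal{F}_{\mathbb{K}}(S) :
f(x) = 0 \text{ for all } x \in B\}$, show that $\mathcal{V}$ and $\mathcal{W}$ are subspaces of $\mathcal{F}_{\mathbb{K}}(S)$ and that
$\mathcal{F}_{\mathbb{K}}(S) = \mathcal{V} \oplus \mathcal{W}$.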


Exercise 1.51. Show that $\mathcal{M}_{n\times n}(\mathbb{K}) = \mathcal{A}_{n\times n}(\mathbb{K}) \oplus \mathcal{UD}_{n\times n}(\mathbb{K})$.


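Exercise 1.52. Let $x_1, \dots, x_n$ be distinct elements of a set $S$. Show that the
set $\mathcal{U} = \{f \in \mathcal{F}_{\mathbb{K}}(S) : f(x_1) = \dots = f(x_n) = 0\}$ is a vector subspace of $\mathcal{F}_{\mathbb{K}}(S)$
and determine a subspace $\mathcal{W}$ of $\mathcal{F}_{\mathbb{K}}(S)$ such that

$$\mathcal{U} \oplus \mathcal{W} = \mathcal{F}_{\mathbb{K}}(S).$$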


1.7.5 Dimension of a vector space 


Exercise 1.53. Let $p_0, \dots, p_n$ be polynomials in $\mathcal{P}_n(\mathbb{K})$ such that $\deg p_j = j$
for every $j \in \{0, \dots, n\}$. Show that $\{p_0, \dots, p_n\}$ is a basis of $\mathcal{P}_n(\mathbb{K})$.


Exercise 1.54. Show that the dimension of the vector space $\mathcal{V}$ in Exercise 1.4
is 1.


Exercise 1.55. Show that the dimension of the vector space $\mathcal{V}$ in Exercise 1.3
is 2.


Exercise 1.56. Show that 


cor (EPI Ep 


is a basis of the vector space Mo2x2(K). 




Exercise 1.57. Determine 3 different bases in the vector space M2x2(KK) which 


: 1 2} |3 0 
are extensions of the set { E A , f 7 \ 


Exercise 1.58. Show that the set of all matrices of the form ; ik where 
a,b € K are arbitrary, is a subspace of Mo2y2(KK) and determine the dimension 
of this subspace. 

Exercise 1.59. Show that the set of all matrices of the form E Hr where 


a,b,x,y € K and 22 + 5y = 0, is a subspace of M2,2(K) and determine the 
dimension of this subspace. 


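Exercise 1.60. Show that the set

$$\mathcal{U} = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in \mathcal{M}_{2\times 2}(\mathbb{K}) : a + b = 0,\ c + d = 0,\ a + d = 0 \right\}$$

is a vector subspace of $\mathcal{M}_{2\times 2}(\mathbb{K})$ and determine $\dim\mathcal{U}$.


Exercise 1.61. Determine $\dim \mathcal{S}_{3\times 3}(\mathbb{K})$.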


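Exercise 1.62. If $\mathcal{U}_1, \dots, \mathcal{U}_k$ are subspaces of an $n$-dimensional vector space $\mathcal{V}$,
show that $\dim\mathcal{U}_1 + \dots + \dim\mathcal{U}_k \le (k - 1)n + \dim(\mathcal{U}_1 \cap \dots \cap \mathcal{U}_k)$.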


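Exercise 1.63. Let $\mathcal{U}$ and $\mathcal{W}$ be vector subspaces of a vector space $\mathcal{V}$ where
$\dim\mathcal{V} = n$. If $\{\mathbf{0}\} \neq \mathcal{U} \not\subset \mathcal{W}$, $\dim\mathcal{U} = m$, and $\dim\mathcal{W} = n - 1$, determine
$\dim(\mathcal{U} \cap \mathcal{W})$.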


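Exercise 1.64. Let $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ be a basis of a vector space $\mathcal{V}$ and let
$\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3 \in \mathcal{V}$ be linearly independent. Show that at least one of the sets

$$\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3, \mathbf{v}_1\},\quad \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3, \mathbf{v}_2\},\quad \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3, \mathbf{v}_3\},\quad \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3, \mathbf{v}_4\}$$

is a basis of $\mathcal{V}$.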


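Exercise 1.65. Let $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ be a basis of a vector space $\mathcal{V}$. If

$$\begin{aligned}
\mathbf{w}_1 &= \mathbf{v}_1 + 3\mathbf{v}_2 + 4\mathbf{v}_3 - \mathbf{v}_4 \\
\mathbf{w}_2 &= 2\mathbf{v}_1 + \mathbf{v}_2 + 3\mathbf{v}_3 + 3\mathbf{v}_4 \\
\mathbf{w}_3 &= \mathbf{v}_1 + \mathbf{v}_2 + 2\mathbf{v}_3 + \mathbf{v}_4,
\end{aligned}$$

determine the dimension of $\operatorname{Span}\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ and find a basis of this subspace.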


Exercise 1.66. Show that the set U = {p € Pn(K) : p(1) = p’'”(1) = Of isa 
subspace of $\mathcal{P}_n(\mathbb{K})$. Determine the dimension of $\mathcal{U}$, find a basis of $\mathcal{U}$, and extend
this basis to a basis of $\mathcal{P}_n(\mathbb{K})$.


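Exercise 1.67. If $\mathcal{V} = \{p \in \mathcal{P}_n(\mathbb{K}) : p(1) = p(2) = 0\}$ and $\mathcal{W} = \{p \in
\mathcal{P}_n(\mathbb{K}) : p(1) = p(3) = 0\}$, describe $\mathcal{V} + \mathcal{W}$ and verify that $\dim(\mathcal{V} + \mathcal{W}) =
\dim\mathcal{V} + \dim\mathcal{W} - \dim(\mathcal{V} \cap \mathcal{W})$.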




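Exercise 1.68. Let $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5\}$ be linearly independent vectors in a
vector space $\mathcal{V}$ and let $\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3 \in \operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5\}$. We assume that

$$\begin{aligned}
\mathbf{w}_1 &= a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + a_3\mathbf{v}_3 + a_4\mathbf{v}_4 + a_5\mathbf{v}_5 \\
\mathbf{w}_2 &= b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + b_3\mathbf{v}_3 + b_4\mathbf{v}_4 + b_5\mathbf{v}_5 \\
\mathbf{w}_3 &= c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 + c_4\mathbf{v}_4 + c_5\mathbf{v}_5
\end{aligned}$$

and that the reduced echelon form of the matrix

$$\begin{bmatrix} a_1 & a_2 & a_3 & a_4 & a_5 \\ b_1 & b_2 & b_3 & b_4 & b_5 \\ c_1 & c_2 & c_3 & c_4 & c_5 \end{bmatrix}$$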


is 


oor 


t002 
010 y 
001 2 
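Show that $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3, \mathbf{v}_2, \mathbf{v}_5\}$ is a basis of $\operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5\}$.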


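Exercise 1.69. Show that $\dim\mathcal{M}_{n\times n}(\mathbb{K}) = \dim\mathcal{UD}_{n\times n}(\mathbb{K}) + \dim\mathcal{S}_{n\times n}(\mathbb{K}) -
\dim\mathcal{D}_{n\times n}(\mathbb{K})$.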


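Exercise 1.70. Show that the set $\mathcal{U} = \{p \in \mathcal{P}_3(\mathbb{K}) : p(1) = p(2) = 0\}$ is a
subspace of $\mathcal{P}_3(\mathbb{K})$, determine the dimension of $\mathcal{U}$, find a basis of $\mathcal{U}$, and extend
that basis to a basis of $\mathcal{P}_3(\mathbb{K})$.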


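Exercise 1.71. Let $\mathcal{V} = \{p \in \mathcal{P}_n(\mathbb{K}) : p(1) = p(-1) = 0\}$ and $\mathcal{W} = \{p \in
\mathcal{P}_n(\mathbb{K}) : p(i) = p(-i) = 0\}$, where $n > 4$. Show that $\mathcal{V}$ and $\mathcal{W}$ are subspaces of
$\mathcal{P}_n(\mathbb{K})$, $\mathcal{V} + \mathcal{W} = \mathcal{P}_n(\mathbb{K})$, and $\dim(\mathcal{V} + \mathcal{W}) = \dim\mathcal{V} + \dim\mathcal{W} - \dim(\mathcal{V} \cap \mathcal{W})$.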


1.7.6 Change of basis 


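Exercise 1.72. Find the change of coordinates matrix from $\{\mathbf{w}_1, \mathbf{w}_2\}$ to $\{\mathbf{w}_3,
\mathbf{w}_4\}$ defined in Exercise 1.42.


Exercise 1.73. Show that $\{t - a, t(t - a), t^2(t - a)\}$ and $\{t - a, (t - a)^2, (t - a)^3\}$
are bases in the vector subspace $\{p \in \mathcal{P}_3(\mathbb{K}) : p(a) = 0\}$ and find the change of
coordinates matrix from $\{t - a, (t - a)^2, (t - a)^3\}$ to $\{t - a, t(t - a), t(t - a)^2\}$.


Exercise 1.74. Let $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ be a basis of a vector space $\mathcal{V}$. If

$$\begin{aligned}
\mathbf{w}_1 &= \mathbf{v}_1 + 2\mathbf{v}_2 + 4\mathbf{v}_3 + \mathbf{v}_4 \\
\mathbf{w}_2 &= 2\mathbf{v}_1 + 4\mathbf{v}_2 + 7\mathbf{v}_3 + 3\mathbf{v}_4 \\
\mathbf{w}_3 &= \mathbf{v}_1 + 2\mathbf{v}_2 + 5\mathbf{v}_4 \\
\mathbf{w}_4 &= \mathbf{v}_3 - \mathbf{v}_4,
\end{aligned}$$

show that $\{\mathbf{w}_1, \mathbf{w}_2\}$ and $\{\mathbf{w}_3, \mathbf{w}_4\}$ are bases of the same subspace and find the
change of coordinates matrix from $\{\mathbf{w}_3, \mathbf{w}_4\}$ to $\{\mathbf{w}_1, \mathbf{w}_2\}$ and from $\{\mathbf{w}_1, \mathbf{w}_2\}$ to
$\{\mathbf{w}_3, \mathbf{w}_4\}$.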



Chapter 2 


Linear Transformations 


Introduction 


When limits are introduced in calculus, one of the first properties of limits that
we learn is

$$\lim_{x \to a} (f(x) + g(x)) = \lim_{x \to a} f(x) + \lim_{x \to a} g(x) \quad\text{and}\quad \lim_{x \to a} c f(x) = c \lim_{x \to a} f(x),$$

where $c$ is an arbitrary constant. These properties are then used in a more
general version, namely,

$$\lim_{x \to a} (c_1 f_1(x) + \dots + c_n f_n(x)) = c_1 \lim_{x \to a} f_1(x) + \dots + c_n \lim_{x \to a} f_n(x),$$

where $c_1, \dots, c_n$ are arbitrary constants. Then we see a similar property for
derivatives

$$\frac{d}{dx}(c_1 f_1(x) + \dots + c_n f_n(x)) = c_1 \frac{d}{dx} f_1(x) + \dots + c_n \frac{d}{dx} f_n(x)$$

and integrals

$$\int (c_1 f_1(x) + \dots + c_n f_n(x))\,dx = c_1 \int f_1(x)\,dx + \dots + c_n \int f_n(x)\,dx.$$

This property is referred to as linearity.

When matrix multiplication is introduced in an introductory matrix linear 
algebra course, again one of the first properties we mention is linearity of matrix 
multiplication. 

Linearity of functions between vector spaces is one of the fundamental ideas 
of linear algebra. In this chapter we study properties of linear functions in the 
abstract setting of vector spaces. 




2.1 Basic properties 


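Definition 2.1.1. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces. A function $f : \mathcal{V} \to \mathcal{W}$
is called linear if it satisfies the following two conditions:

(a) $f(\mathbf{x} + \mathbf{y}) = f(\mathbf{x}) + f(\mathbf{y})$ for every $\mathbf{x}, \mathbf{y} \in \mathcal{V}$;

(b) $f(\alpha\mathbf{x}) = \alpha f(\mathbf{x})$ for every $\mathbf{x} \in \mathcal{V}$ and every $\alpha \in \mathbb{K}$.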


Linear functions from V to W are called linear transformations. 


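Note that, if $f : \mathcal{V} \to \mathcal{W}$ is a linear transformation, then

$$f(\alpha_1\mathbf{x}_1 + \dots + \alpha_j\mathbf{x}_j) = \alpha_1 f(\mathbf{x}_1) + \dots + \alpha_j f(\mathbf{x}_j)$$

for any vectors $\mathbf{x}_1, \dots, \mathbf{x}_j \in \mathcal{V}$ and any numbers $\alpha_1, \dots, \alpha_j \in \mathbb{K}$.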


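Proposition 2.1.2. If $f : \mathcal{V} \to \mathcal{W}$ is a linear transformation, then

(a) $f(\mathbf{0}) = \mathbf{0}$;

(b) $f(-\mathbf{v}) = -f(\mathbf{v})$ for every $\mathbf{v} \in \mathcal{V}$.


Proof. To prove (a) it suffices to note that

$$f(\mathbf{0}) = f(\mathbf{0} + \mathbf{0}) = f(\mathbf{0}) + f(\mathbf{0})$$

and to prove (b) it suffices to note that

$$\mathbf{0} = f(\mathbf{0}) = f(\mathbf{v} - \mathbf{v}) = f(\mathbf{v}) + f(-\mathbf{v}).$$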


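Example 2.1.3. Let $\mathcal{V}$ be a vector space. Show that the function $\operatorname{Id} : \mathcal{V} \to \mathcal{V}$
defined by $\operatorname{Id}(\mathbf{x}) = \mathbf{x}$ is linear.


Solution. We have

$$\operatorname{Id}(\mathbf{x} + \mathbf{y}) = \mathbf{x} + \mathbf{y} = \operatorname{Id}(\mathbf{x}) + \operatorname{Id}(\mathbf{y})$$

for every $\mathbf{x}, \mathbf{y} \in \mathcal{V}$, and

$$\operatorname{Id}(\alpha\mathbf{x}) = \alpha\mathbf{x} = \alpha\operatorname{Id}(\mathbf{x})$$

for every $\mathbf{x} \in \mathcal{V}$ and every $\alpha \in \mathbb{K}$. □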




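Example 2.1.4. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces. If $f : \mathcal{V} \to \mathcal{W}$ and $g : \mathcal{V} \to \mathcal{W}$
are linear transformations, then the function $f + g : \mathcal{V} \to \mathcal{W}$ defined by

$$(f + g)(\mathbf{x}) = f(\mathbf{x}) + g(\mathbf{x}),$$

for every $\mathbf{x} \in \mathcal{V}$, is a linear transformation.


Solution. For any vectors $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ and any number $\alpha \in \mathbb{K}$ we have

$$\begin{aligned}
(f + g)(\mathbf{x} + \mathbf{y}) &= f(\mathbf{x} + \mathbf{y}) + g(\mathbf{x} + \mathbf{y}) \\
&= f(\mathbf{x}) + f(\mathbf{y}) + g(\mathbf{x}) + g(\mathbf{y}) = f(\mathbf{x}) + g(\mathbf{x}) + f(\mathbf{y}) + g(\mathbf{y}) \\
&= (f + g)(\mathbf{x}) + (f + g)(\mathbf{y})
\end{aligned}$$

and

$$\begin{aligned}
(f + g)(\alpha\mathbf{x}) &= f(\alpha\mathbf{x}) + g(\alpha\mathbf{x}) = \alpha f(\mathbf{x}) + \alpha g(\mathbf{x}) \\
&= \alpha(f(\mathbf{x}) + g(\mathbf{x})) = \alpha(f + g)(\mathbf{x}).
\end{aligned}$$

This means that $f + g$ is a linear transformation. □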


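Example 2.1.5. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces. If $f : \mathcal{V} \to \mathcal{W}$ is a linear
transformation and $\alpha \in \mathbb{K}$, then the function $\alpha f : \mathcal{V} \to \mathcal{W}$ defined by

$$(\alpha f)(\mathbf{x}) = \alpha f(\mathbf{x})$$

for every $\mathbf{x} \in \mathcal{V}$, is a linear transformation.


Solution. For any vectors $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ and any number $\beta \in \mathbb{K}$ we have

$$\begin{aligned}
(\alpha f)(\mathbf{x} + \mathbf{y}) &= \alpha(f(\mathbf{x} + \mathbf{y})) = \alpha(f(\mathbf{x}) + f(\mathbf{y})) \\
&= \alpha f(\mathbf{x}) + \alpha f(\mathbf{y}) = (\alpha f)(\mathbf{x}) + (\alpha f)(\mathbf{y})
\end{aligned}$$

and

$$(\alpha f)(\beta\mathbf{x}) = \alpha(f(\beta\mathbf{x})) = \alpha\beta f(\mathbf{x}) = \beta\alpha f(\mathbf{x}) = \beta(\alpha f)(\mathbf{x}).$$

This means that $\alpha f$ is a linear transformation. □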


The proof of the following important result is a consequence of the defini- 
tions. 




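Theorem 2.1.6. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces. The set of all linear
transformations from $\mathcal{V}$ to $\mathcal{W}$ with the operations $f + g$ and $\alpha f$ defined
as

$$(f + g)(\mathbf{x}) = f(\mathbf{x}) + g(\mathbf{x}) \quad\text{and}\quad (\alpha f)(\mathbf{x}) = \alpha f(\mathbf{x})$$

is a vector space.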


Definition 2.1.7. The vector space of all linear transformations from a 
vector space V to a vector space W is denoted by L(V, W). 

A linear transformation f : V — V is also called an operator or an 
endomorphism. The vector space L(V, V) is often denoted by L(V). 


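Example 2.1.8. If $A \in \mathcal{M}_{n\times m}(\mathbb{K})$, show that the function $f : \mathbb{K}^m \to \mathbb{K}^n$
defined by $f(\mathbf{x}) = A\mathbf{x}$ is a linear transformation.


Solution. From properties of matrix multiplication we get

$$f(\mathbf{x} + \mathbf{y}) = A(\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A\mathbf{y} = f(\mathbf{x}) + f(\mathbf{y})$$

and

$$f(\alpha\mathbf{x}) = A(\alpha\mathbf{x}) = \alpha A\mathbf{x} = \alpha f(\mathbf{x}).$$

This means that $f$ is a linear transformation. □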


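Example 2.1.9. Let $\mathcal{V}_1, \dots, \mathcal{V}_n$ be subspaces of a vector space $\mathcal{V}$. Show that
the function $f : \mathcal{V}_1 \times \dots \times \mathcal{V}_n \to \mathcal{V}$ defined by $f(\mathbf{x}_1, \dots, \mathbf{x}_n) = \mathbf{x}_1 + \dots + \mathbf{x}_n$ is
a linear transformation.


Solution. The proof is an immediate consequence of the definition of a linear
transformation. □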


Another important property of linear transformations is that the composi- 
tion of linear transformations is a linear transformation. 


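Theorem 2.1.10. Let $\mathcal{V}$, $\mathcal{W}$, and $\mathcal{X}$ be vector spaces. If $f : \mathcal{V} \to \mathcal{W}$ and
$g : \mathcal{W} \to \mathcal{X}$ are linear transformations, then the function $g \circ f : \mathcal{V} \to \mathcal{X}$
is a linear transformation.


In other words, if $f \in \mathcal{L}(\mathcal{V}, \mathcal{W})$ and $g \in \mathcal{L}(\mathcal{W}, \mathcal{X})$, then $g \circ f \in \mathcal{L}(\mathcal{V}, \mathcal{X})$.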




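Proof. If $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ and $\alpha \in \mathbb{K}$, then

$$\begin{aligned}
g \circ f(\mathbf{x} + \mathbf{y}) &= g(f(\mathbf{x} + \mathbf{y})) = g(f(\mathbf{x}) + f(\mathbf{y})) \\
&= g(f(\mathbf{x})) + g(f(\mathbf{y})) = g \circ f(\mathbf{x}) + g \circ f(\mathbf{y})
\end{aligned}$$

and

$$g \circ f(\alpha\mathbf{x}) = g(f(\alpha\mathbf{x})) = g(\alpha f(\mathbf{x})) = \alpha g(f(\mathbf{x})) = \alpha\, g \circ f(\mathbf{x}).$$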


Example 2.1.11. Let $f : \mathbb{K} \to \mathbb{K}$ and $g : \mathbb{K} \to \mathbb{K}$ be the linear transformations
defined by $f(x) = \alpha x$ and $g(x) = \beta x$, where $\alpha$ and $\beta$ are numbers from $\mathbb{K}$.
The linear transformation $g \circ f$ is defined by $g \circ f(x) = (\alpha\beta)x$. Note that in
this case $g \circ f = f \circ g$. This equality is not generally true.


The above example can be generalized as follows. 


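Example 2.1.12. Let $f : \mathbb{K}^m \to \mathbb{K}^n$ and $g : \mathbb{K}^n \to \mathbb{K}^p$ be the linear transformations
defined by

$$f(\mathbf{x}) = A\mathbf{x} \quad\text{and}\quad g(\mathbf{y}) = B\mathbf{y},$$

where $A \in \mathcal{M}_{n\times m}(\mathbb{K})$ and $B \in \mathcal{M}_{p\times n}(\mathbb{K})$. Then

$$g \circ f(\mathbf{x}) = (BA)\mathbf{x}$$

for every $\mathbf{x} \in \mathbb{K}^m$.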


In linear algebra it is customary to write the composition $f \circ g$ simply as $fg$
and call it the product of f and g. Note that, if f and g are defined in terms of 
matrices, as in Example 2.1.12, then the product of f and g corresponds to the 
product of matrices. 
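
The correspondence between composition and the matrix product can be checked numerically. The following is a minimal sketch, not part of the original text: the matrices `A` and `B`, the vector `x`, and the helper names `mat_vec` and `mat_mul` are hypothetical choices made only for illustration.

```python
from fractions import Fraction

def mat_vec(A, x):
    """Return the product Ax for a matrix A (list of rows) and a vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def mat_mul(B, A):
    """Return the matrix product BA (rows of B against columns of A)."""
    cols = list(zip(*A))
    return [[sum(b * a for b, a in zip(row, col)) for col in cols] for row in B]

# Hypothetical sample data: A is 2x3 (f : K^3 -> K^2), B is 2x2 (g : K^2 -> K^2).
A = [[1, 2, 0], [0, 1, -1]]
B = [[2, 1], [1, 3]]
x = [Fraction(1), Fraction(-2), Fraction(3)]

g_of_f = mat_vec(B, mat_vec(A, x))   # g(f(x)) = B(Ax)
product = mat_vec(mat_mul(B, A), x)  # (BA)x
print(g_of_f == product)
```

Exact rational arithmetic (`Fraction`) is used so that the two sides can be compared with `==` without rounding concerns.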

Composition of linear transformations has properties similar to multiplica- 
tion. The main difference is that composition is not commutative, that is, in 
general fg is different from gf. Moreover, if fg is well-defined, it does not mean 
that gf makes sense. 


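Proposition 2.1.13. Let $\mathcal{V}$, $\mathcal{W}$, and $\mathcal{X}$ be vector spaces, let $f, f' : \mathcal{V} \to \mathcal{W}$
and $g, g' : \mathcal{W} \to \mathcal{X}$ be linear transformations, and let $\alpha \in \mathbb{K}$. Then

(a) $g(f + f') = gf + gf'$;
(b) $(g + g')f = gf + g'f$;
(c) $(\alpha g)f = g(\alpha f) = \alpha(gf)$.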




Proof. The properties are direct consequences of the definitions. The proof is 
left as an exercise. O 


The following theorem implies that a linear transformation f € L(V,W) 
is completely determined by its values at elements of an arbitrary basis of V. 
This is very different from arbitrary functions from V to W and has important 
consequences. 


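Theorem 2.1.14. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces and let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$
be a basis of $\mathcal{V}$. For any $\mathbf{w}_1, \dots, \mathbf{w}_n \in \mathcal{W}$ there is a unique linear
transformation $f : \mathcal{V} \to \mathcal{W}$ such that

$$f(\mathbf{v}_1) = \mathbf{w}_1, \ \dots, \ f(\mathbf{v}_n) = \mathbf{w}_n.$$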


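Proof. Since $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a basis of $\mathcal{V}$, for every $\mathbf{x} \in \mathcal{V}$ there are unique
numbers $x_1, \dots, x_n \in \mathbb{K}$ such that $\mathbf{x} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n$. We define

$$f(\mathbf{x}) = f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n.$$

Since for any $\alpha \in \mathbb{K}$ we have

$$\alpha\mathbf{x} = \alpha(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = (\alpha x_1)\mathbf{v}_1 + \dots + (\alpha x_n)\mathbf{v}_n,$$

we also have

$$f(\alpha\mathbf{x}) = (\alpha x_1)\mathbf{w}_1 + \dots + (\alpha x_n)\mathbf{w}_n = \alpha(x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n) = \alpha f(\mathbf{x}).$$

Now, if $\mathbf{x}, \mathbf{y} \in \mathcal{V}$, then

$$\mathbf{x} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n \quad\text{and}\quad \mathbf{y} = y_1\mathbf{v}_1 + \dots + y_n\mathbf{v}_n,$$

for some numbers $x_1, \dots, x_n, y_1, \dots, y_n \in \mathbb{K}$. Since

$$\mathbf{x} + \mathbf{y} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n + y_1\mathbf{v}_1 + \dots + y_n\mathbf{v}_n = (x_1 + y_1)\mathbf{v}_1 + \dots + (x_n + y_n)\mathbf{v}_n,$$

we get

$$\begin{aligned}
f(\mathbf{x} + \mathbf{y}) &= (x_1 + y_1)\mathbf{w}_1 + \dots + (x_n + y_n)\mathbf{w}_n \\
&= x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n + y_1\mathbf{w}_1 + \dots + y_n\mathbf{w}_n \\
&= f(\mathbf{x}) + f(\mathbf{y}).
\end{aligned}$$

This shows that the defined function $f$ is a linear transformation. Clearly,

$$f(\mathbf{v}_1) = \mathbf{w}_1, \ \dots, \ f(\mathbf{v}_n) = \mathbf{w}_n.$$

Now we need to show that the function $f$ defined above is the unique linear
transformation such that $f(\mathbf{v}_1) = \mathbf{w}_1, \dots, f(\mathbf{v}_n) = \mathbf{w}_n$. Let $g$ be any linear
transformation such that $g(\mathbf{v}_1) = \mathbf{w}_1, \dots, g(\mathbf{v}_n) = \mathbf{w}_n$ and let $\mathbf{x} \in \mathcal{V}$. Then
$\mathbf{x} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n$ for some $x_1, \dots, x_n \in \mathbb{K}$ and we have

$$\begin{aligned}
g(\mathbf{x}) &= g(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = x_1 g(\mathbf{v}_1) + \dots + x_n g(\mathbf{v}_n) \\
&= x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n = f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = f(\mathbf{x}).
\end{aligned}$$

This proves the uniqueness and completes the proof. □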
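
The construction in the proof of Theorem 2.1.14 (expand a vector in the basis, then replace each basis vector by its prescribed image) can be sketched in code. This is an illustrative sketch only: the basis of $\mathbb{K}^2$, the target vectors, and the helper names `coords_2d` and `extend_linearly` are hypothetical, and the coordinates are found by Cramer's rule, which is a convenience for the 2-dimensional case rather than anything used in the text.

```python
from fractions import Fraction

def coords_2d(v1, v2, x):
    """Coordinates (c1, c2) of x in the basis {v1, v2} of K^2, by Cramer's rule."""
    det = Fraction(v1[0] * v2[1] - v2[0] * v1[1])
    if det == 0:
        raise ValueError("v1 and v2 do not form a basis")
    c1 = (x[0] * v2[1] - v2[0] * x[1]) / det
    c2 = (v1[0] * x[1] - x[0] * v1[1]) / det
    return c1, c2

def extend_linearly(v1, v2, w1, w2):
    """Return the linear map f with f(v1) = w1 and f(v2) = w2."""
    def f(x):
        c1, c2 = coords_2d(v1, v2, x)  # x = c1*v1 + c2*v2
        return [c1 * a + c2 * b for a, b in zip(w1, w2)]  # f(x) = c1*w1 + c2*w2
    return f

# Hypothetical sample data: a basis of K^2 and two target vectors in K^3.
v1, v2 = [1, 1], [1, -1]
w1, w2 = [1, 0, 2], [0, 3, 1]
f = extend_linearly(v1, v2, w1, w2)
print(f([3, 1]))  # [3, 1] = 2*v1 + 1*v2, so the value is 2*w1 + w2
```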




2.1.1 The kernel and range of a linear transformation 


Definition 2.1.15. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. The set

$$\ker f = \{\mathbf{x} \in \mathcal{V} : f(\mathbf{x}) = \mathbf{0}\}$$

is called the kernel of $f$.


Example 2.1.16. Consider an $m \times n$ matrix $A$ with entries in $\mathbb{K}$ and the linear
transformation $f : \mathbb{K}^n \to \mathbb{K}^m$ defined by $f(\mathbf{x}) = A\mathbf{x}$. Then $\ker f = \mathbf{N}(A)$ (see
Example 1.2.8).


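Theorem 2.1.17. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. Then
$\ker f$ is a subspace of $\mathcal{V}$.


Proof. If $\mathbf{v} \in \ker f$ and $\alpha \in \mathbb{K}$, then

$$f(\alpha\mathbf{v}) = \alpha f(\mathbf{v}) = \mathbf{0},$$

and thus $\alpha\mathbf{v} \in \ker f$. Similarly, if $\mathbf{v}_1, \mathbf{v}_2 \in \ker f$, then

$$f(\mathbf{v}_1 + \mathbf{v}_2) = f(\mathbf{v}_1) + f(\mathbf{v}_2) = \mathbf{0},$$

and thus $\mathbf{v}_1 + \mathbf{v}_2 \in \ker f$. □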


Example 2.1.18. Consider the linear transformation $f : \mathcal{P}_2(\mathbb{R}) \to \mathbb{R}$ defined
by $f(p) = \int_0^1 p(t)\,dt$. Determine $\ker f$ and $\dim\ker f$.


Solution. An arbitrary element of $\mathcal{P}_2(\mathbb{R})$ is of the form $at^2 + bt + c$ where $a$,
$b$, and $c$ are real numbers. Since

$$\int_0^1 (at^2 + bt + c)\,dt = \frac{a}{3} + \frac{b}{2} + c,$$

$f(at^2 + bt + c) = 0$ if and only if $\frac{a}{3} + \frac{b}{2} + c = 0$ or, equivalently, $c = -\frac{a}{3} - \frac{b}{2}$.
Consequently, $f(at^2 + bt + c) = 0$ if and only if

$$at^2 + bt + c = at^2 + bt - \frac{a}{3} - \frac{b}{2} = a\left(t^2 - \frac{1}{3}\right) + b\left(t - \frac{1}{2}\right).$$

Hence

$$\ker f = \operatorname{Span}\left\{t^2 - \frac{1}{3},\ t - \frac{1}{2}\right\}$$

and $\dim\ker f = 2$. □
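
The kernel found in Example 2.1.18 can be checked by exact arithmetic. The sketch below assumes that the integral in the example runs over $[0, 1]$, as the value $\frac{a}{3} + \frac{b}{2} + c$ in the solution indicates. Representing a polynomial by its list of coefficients and the helper name `integral_0_1` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def integral_0_1(coeffs):
    """Integral over [0, 1] of c0 + c1*t + c2*t^2 + ..., given [c0, c1, c2, ...]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(coeffs))

p1 = [Fraction(-1, 3), 0, 1]  # t^2 - 1/3
p2 = [Fraction(-1, 2), 1, 0]  # t - 1/2
one = [1, 0, 0]               # the constant polynomial 1

print(integral_0_1(p1), integral_0_1(p2), integral_0_1(one))
```

Both spanning polynomials are sent to zero, while the constant polynomial 1 is not, which is consistent with the kernel having dimension 2 inside the 3-dimensional space $\mathcal{P}_2(\mathbb{R})$.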


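Example 2.1.19. Let $f : \mathcal{V} \to \mathcal{V}$ be a linear transformation such that

$$(f - \alpha\operatorname{Id})(f - \beta\operatorname{Id}) = 0$$

(where $\operatorname{Id} : \mathcal{V} \to \mathcal{V}$ is the identity linear transformation) and $\alpha, \beta \in \mathbb{K}$ with
$\alpha \neq \beta$. Show that

$$\mathcal{V} = \ker(f - \alpha\operatorname{Id}) \oplus \ker(f - \beta\operatorname{Id}).$$

Solution. First we note that

$$(f - \alpha\operatorname{Id}) - (f - \beta\operatorname{Id}) = (\beta - \alpha)\operatorname{Id}. \tag{2.1}$$

Consequently, for any $\mathbf{v} \in \mathcal{V}$, we have

$$(f - \alpha\operatorname{Id})\mathbf{v} - (f - \beta\operatorname{Id})\mathbf{v} = (\beta - \alpha)\mathbf{v}$$

and thus

$$\mathbf{v} = \frac{1}{\beta - \alpha}(f - \alpha\operatorname{Id})\mathbf{v} - \frac{1}{\beta - \alpha}(f - \beta\operatorname{Id})\mathbf{v}.$$

Since $(f - \alpha\operatorname{Id})\mathbf{v} \in \ker(f - \beta\operatorname{Id})$ and $(f - \beta\operatorname{Id})\mathbf{v} \in \ker(f - \alpha\operatorname{Id})$, we have

$$\mathcal{V} = \ker(f - \alpha\operatorname{Id}) + \ker(f - \beta\operatorname{Id}).$$

To finish the proof we have to show that the sum is direct. Indeed, if $\mathbf{v} \in
\ker(f - \alpha\operatorname{Id})$ and $\mathbf{v} \in \ker(f - \beta\operatorname{Id})$, then $\mathbf{v} = \mathbf{0}$ by (2.1). □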
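
The decomposition used in the solution of Example 2.1.19 can be tried on a small matrix. This is a sketch under stated assumptions: the matrix `F`, the scalars `alpha = 2` and `beta = 3`, and the vector `v` are hypothetical sample data chosen so that the hypothesis of the example holds, and `mat_vec` and `shifted` are illustrative helper names.

```python
from fractions import Fraction

def mat_vec(M, v):
    """Apply the 2x2 matrix M (list of rows) to the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def shifted(M, c):
    """Return the matrix M - c*Id."""
    return [[M[i][j] - (c if i == j else 0) for j in range(2)] for i in range(2)]

# Hypothetical sample: F is upper triangular with diagonal entries 2 and 3,
# so (F - 2*Id)(F - 3*Id) = 0.
F = [[2, 1], [0, 3]]
alpha, beta = 2, 3
v = [2, 5]

scale = Fraction(1, beta - alpha)
# Component in ker(F - beta*Id): (1/(beta - alpha)) (F - alpha*Id) v
in_ker_beta = [scale * x for x in mat_vec(shifted(F, alpha), v)]
# Component in ker(F - alpha*Id): -(1/(beta - alpha)) (F - beta*Id) v
in_ker_alpha = [-scale * x for x in mat_vec(shifted(F, beta), v)]

print(in_ker_beta, in_ker_alpha)
```

The two components add up to `v`, and each is annihilated by the corresponding factor, matching the sum established in the solution.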


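Definition 2.1.20. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. The set

$$\operatorname{ran} f = \{f(\mathbf{x}) : \mathbf{x} \in \mathcal{V}\}$$

is called the range of $f$.


In other words, if $f : \mathcal{V} \to \mathcal{W}$, then $\operatorname{ran} f$ is the set of all $\mathbf{y} \in \mathcal{W}$ such that
$\mathbf{y} = f(\mathbf{x})$ for some $\mathbf{x} \in \mathcal{V}$. We can also write $\operatorname{ran} f = f(\mathcal{V})$.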




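Theorem 2.1.21. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. Then
$\operatorname{ran} f$ is a subspace of $\mathcal{W}$.


Proof. If $\mathbf{w} \in \operatorname{ran} f$, then $\mathbf{w} = f(\mathbf{v})$ for some $\mathbf{v} \in \mathcal{V}$. Then for any $\alpha \in \mathbb{K}$ we
have

$$\alpha\mathbf{w} = \alpha f(\mathbf{v}) = f(\alpha\mathbf{v}),$$

so $\alpha\mathbf{w} \in \operatorname{ran} f$. Similarly, if $\mathbf{w}_1, \mathbf{w}_2 \in \operatorname{ran} f$, then $\mathbf{w}_1 = f(\mathbf{v}_1)$ and $\mathbf{w}_2 = f(\mathbf{v}_2)$
for some $\mathbf{v}_1, \mathbf{v}_2 \in \mathcal{V}$ and we have

$$\mathbf{w}_1 + \mathbf{w}_2 = f(\mathbf{v}_1) + f(\mathbf{v}_2) = f(\mathbf{v}_1 + \mathbf{v}_2),$$

so $\mathbf{w}_1 + \mathbf{w}_2 \in \operatorname{ran} f$. □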


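Example 2.1.22. Consider an $m \times n$ matrix $A$ with entries in $\mathbb{K}$ and the linear
transformation $f : \mathbb{K}^n \to \mathbb{K}^m$ defined by $f(\mathbf{x}) = A\mathbf{x}$. Show that $\operatorname{ran} f = \mathbf{C}(A)$
(see Example 1.2.7).


Solution. Let $A = \begin{bmatrix} \mathbf{a}_1 & \dots & \mathbf{a}_n \end{bmatrix}$ where $\mathbf{a}_1, \dots, \mathbf{a}_n$ are the columns of the matrix
$A$. Then

$$\operatorname{ran} f = \{A\mathbf{x} : \mathbf{x} \in \mathbb{K}^n\} = \left\{ x_1\mathbf{a}_1 + \dots + x_n\mathbf{a}_n : \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{K}^n \right\} = \mathbf{C}(A).$$

□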


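Theorem 2.1.23. A linear transformation $f : \mathcal{V} \to \mathcal{W}$ is injective if
and only if $\ker f = \{\mathbf{0}\}$.


Proof. Since $f(\mathbf{0}) = \mathbf{0}$, if $f$ is injective, then the only $\mathbf{v} \in \mathcal{V}$ such that $f(\mathbf{v}) = \mathbf{0}$
is $\mathbf{v} = \mathbf{0}$. This means that $\ker f = \{\mathbf{0}\}$.
Now assume $\ker f = \{\mathbf{0}\}$. If $f(\mathbf{v}_1) = f(\mathbf{v}_2)$ for some $\mathbf{v}_1, \mathbf{v}_2 \in \mathcal{V}$, then

$$f(\mathbf{v}_1 - \mathbf{v}_2) = f(\mathbf{v}_1) - f(\mathbf{v}_2) = \mathbf{0}.$$

So $\mathbf{v}_1 - \mathbf{v}_2 \in \ker f = \{\mathbf{0}\}$, which means that $\mathbf{v}_1 - \mathbf{v}_2 = \mathbf{0}$ or $\mathbf{v}_1 = \mathbf{v}_2$. This
shows that $f$ is injective. □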




2.1.2 Projections 


Projections on subspaces play an important role in linear algebra. In this section 
we discuss projections associated with direct sums. In Chapter 3 we will discuss 
orthogonal projections that are a special type of projections discussed in this 
section. 

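Recall that, if $\mathcal{V}$ is a vector space and $\mathcal{U}$ and $\mathcal{W}$ are subspaces of $\mathcal{V}$ such that
$\mathcal{V} = \mathcal{U} \oplus \mathcal{W}$, then for every $\mathbf{v} \in \mathcal{V}$ there are unique $\mathbf{u} \in \mathcal{U}$ and $\mathbf{w} \in \mathcal{W}$ such that
$\mathbf{v} = \mathbf{u} + \mathbf{w}$. This property is essential for the following definition.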


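Definition 2.1.24. Let $\mathcal{V}$ be a vector space and let $\mathcal{U}$ and $\mathcal{W}$ be
subspaces of $\mathcal{V}$ such that $\mathcal{V} = \mathcal{U} \oplus \mathcal{W}$. The function $f : \mathcal{V} \to \mathcal{V}$ defined
by

$$f(\mathbf{u} + \mathbf{w}) = \mathbf{u},$$

where $\mathbf{u} \in \mathcal{U}$ and $\mathbf{w} \in \mathcal{W}$, is called the projection on $\mathcal{U}$ along $\mathcal{W}$.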


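Note that, if $f$ is the projection on $\mathcal{U}$ along $\mathcal{W}$, then $f^2 = f$ and $\operatorname{Id} - f$ is
the projection on $\mathcal{W}$ along $\mathcal{U}$. Indeed, if $\mathbf{u} \in \mathcal{U}$ and $\mathbf{w} \in \mathcal{W}$, then

$$f^2(\mathbf{u} + \mathbf{w}) = f(f(\mathbf{u} + \mathbf{w})) = f(\mathbf{u}) = f(\mathbf{u} + \mathbf{0}) = \mathbf{u} = f(\mathbf{u} + \mathbf{w})$$

and

$$(\operatorname{Id} - f)(\mathbf{u} + \mathbf{w}) = \mathbf{u} + \mathbf{w} - f(\mathbf{u} + \mathbf{w}) = \mathbf{u} + \mathbf{w} - \mathbf{u} = \mathbf{w}.$$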
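
A concrete instance of these two identities can be written down in $\mathbb{R}^2$. The subspaces below, $\mathcal{U} = \operatorname{Span}\{(1,0)\}$ and $\mathcal{W} = \operatorname{Span}\{(1,1)\}$, are a hypothetical choice made for this sketch, as are the function names; the decomposition $(x, y) = (x - y)(1, 0) + y(1, 1)$ is what makes the formula in the code a projection on $\mathcal{U}$ along $\mathcal{W}$.

```python
def project_on_U_along_W(v):
    """Projection on U = Span{(1, 0)} along W = Span{(1, 1)} in R^2.

    Every (x, y) decomposes uniquely as (x - y)(1, 0) + y(1, 1).
    """
    x, y = v
    return (x - y, 0)

def complementary_projection(v):
    """(Id - f)(v): the projection on W along U."""
    u = project_on_U_along_W(v)
    return (v[0] - u[0], v[1] - u[1])

v = (5, 2)
u = project_on_U_along_W(v)      # component of v in U
w = complementary_projection(v)  # component of v in W
print(u, w, project_on_U_along_W(u) == u)
```

Note that this projection is not orthogonal: the direction $(1, 1)$ along which we project is not perpendicular to $\mathcal{U}$.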


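Example 2.1.25. Let $\mathcal{V}$ be a vector space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear
transformation such that $f^2 = f$. Show that

$$\mathcal{V} = \operatorname{ran} f \oplus \ker f$$

and that $f$ is the projection on $\operatorname{ran} f$ along $\ker f$.


Solution. For any $\mathbf{v} \in \mathcal{V}$ we have

$$\mathbf{v} = f(\mathbf{v}) + (\mathbf{v} - f(\mathbf{v}))$$

and

$$f(\mathbf{v} - f(\mathbf{v})) = f(\mathbf{v}) - f(f(\mathbf{v})) = f(\mathbf{v}) - f(\mathbf{v}) = \mathbf{0}.$$

Since $f(\mathbf{v}) \in \operatorname{ran} f$ and $\mathbf{v} - f(\mathbf{v}) \in \ker f$, this shows that $\mathcal{V} = \operatorname{ran} f + \ker f$.
We need to show that this sum is direct, that is, that the only vector that is
in both $\operatorname{ran} f$ and $\ker f$ is the zero vector.

Suppose $\mathbf{v} \in \operatorname{ran} f$ and $\mathbf{v} \in \ker f$. Since $\mathbf{v} \in \operatorname{ran} f$, there is a $\mathbf{w} \in \mathcal{V}$ such
that $f(\mathbf{w}) = \mathbf{v}$. Then

$$\mathbf{v} = f(\mathbf{w}) = f(f(\mathbf{w})) = f(\mathbf{v}) = \mathbf{0},$$

because $\mathbf{v} \in \ker f$.
Clearly, $f$ is the projection on $\operatorname{ran} f$ along $\ker f$. □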


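Example 2.1.26. Let $\mathcal{V}$ be a vector space and let $f_1, \dots, f_n : \mathcal{V} \to \mathcal{V}$ be linear
transformations such that $f_j f_k = 0$ for $j \neq k$ and $f_1 + \dots + f_n = \operatorname{Id}$. Show
that

(a) The linear transformations $f_1, \dots, f_n$ are projections;

(b) $\mathcal{V} = \operatorname{ran} f_1 \oplus \dots \oplus \operatorname{ran} f_n$;

(c) For every $j \in \{1, \dots, n\}$ the transformation $f_j$ is the projection on $\operatorname{ran} f_j$
along $\operatorname{ran} f_1 \oplus \dots \oplus \operatorname{ran} f_{j-1} \oplus \operatorname{ran} f_{j+1} \oplus \dots \oplus \operatorname{ran} f_n$.


Solution. Let $\mathbf{v}$ be an arbitrary vector in $\mathcal{V}$. Since

$$\mathbf{v} = f_1(\mathbf{v}) + \dots + f_n(\mathbf{v}),$$

for every $j \in \{1, \dots, n\}$ we have

$$f_j(\mathbf{v}) = f_j(f_1(\mathbf{v}) + \dots + f_n(\mathbf{v})) = f_j f_j(\mathbf{v}).$$

This shows that $f_j f_j = f_j$ and thus, by Example 2.1.25, $f_j$ is the projection
on $\operatorname{ran} f_j$ along $\ker f_j$.
To prove (b) we first note that, since $\mathbf{v} = f_1(\mathbf{v}) + \dots + f_n(\mathbf{v})$ for every
$\mathbf{v} \in \mathcal{V}$, we have

$$\mathcal{V} = f_1(\mathcal{V}) + \dots + f_n(\mathcal{V}).$$

We need to show that this sum is direct. If

$$f_1(\mathbf{v}_1) + \dots + f_n(\mathbf{v}_n) = \mathbf{0}$$

for some $\mathbf{v}_1, \dots, \mathbf{v}_n \in \mathcal{V}$, then for every $j \in \{1, \dots, n\}$

$$f_j f_1(\mathbf{v}_1) + \dots + f_j f_n(\mathbf{v}_n) = f_j(\mathbf{0}) = \mathbf{0}.$$

On the other hand, since $f_j f_k = 0$ for $j \neq k$, we have

$$f_j f_1(\mathbf{v}_1) + \dots + f_j f_n(\mathbf{v}_n) = f_j f_j(\mathbf{v}_j) = f_j(\mathbf{v}_j)$$

and consequently $f_j(\mathbf{v}_j) = \mathbf{0}$ for every $j \in \{1, \dots, n\}$. This shows that the sum
$f_1(\mathcal{V}) + \dots + f_n(\mathcal{V})$ is direct.

Finally, to prove (c) we take a $\mathbf{v} \in \ker f_j$. Then we have

$$\begin{aligned}
\mathbf{v} &= f_1(\mathbf{v}) + \dots + f_n(\mathbf{v}) \\
&= f_1(\mathbf{v}) + \dots + f_{j-1}(\mathbf{v}) + f_{j+1}(\mathbf{v}) + \dots + f_n(\mathbf{v})
\end{aligned}$$

and thus

$$\mathbf{v} \in f_1(\mathcal{V}) \oplus \dots \oplus f_{j-1}(\mathcal{V}) \oplus f_{j+1}(\mathcal{V}) \oplus \dots \oplus f_n(\mathcal{V}).$$

On the other hand, since $f_j f_k = 0$ for $j \neq k$, every $\mathbf{v} \in f_1(\mathcal{V}) \oplus \dots \oplus f_{j-1}(\mathcal{V}) \oplus
f_{j+1}(\mathcal{V}) \oplus \dots \oplus f_n(\mathcal{V})$ is in $\ker f_j$. Consequently,

$$\ker f_j = f_1(\mathcal{V}) \oplus \dots \oplus f_{j-1}(\mathcal{V}) \oplus f_{j+1}(\mathcal{V}) \oplus \dots \oplus f_n(\mathcal{V}),$$

completing the proof, by Example 2.1.25. □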


2.1.3 The Rank-Nullity Theorem


The main result of this section is an important theorem that connects the di- 
mension of the domain of a linear transformation with the dimensions of its 
range and the subspace on which the transformation is zero, that is, the kernel 
of the transformation. We start with an example that will motivate the result. 


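Example 2.1.27. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. If $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$
is a basis of $\ker f$ and $\{\mathbf{w}_1, \mathbf{w}_2\}$ is a basis of $\operatorname{ran} f$, show that $\dim\mathcal{V} = 5$.


Solution. For any $\mathbf{v} \in \mathcal{V}$ there are $x_1, x_2 \in \mathbb{K}$ such that

$$f(\mathbf{v}) = x_1\mathbf{w}_1 + x_2\mathbf{w}_2.$$

If $\mathbf{u}_1$ and $\mathbf{u}_2$ are vectors in $\mathcal{V}$ such that $f(\mathbf{u}_1) = \mathbf{w}_1$ and $f(\mathbf{u}_2) = \mathbf{w}_2$, then

$$f(\mathbf{v}) = x_1 f(\mathbf{u}_1) + x_2 f(\mathbf{u}_2) = f(x_1\mathbf{u}_1 + x_2\mathbf{u}_2)$$

and thus

$$f(\mathbf{v} - (x_1\mathbf{u}_1 + x_2\mathbf{u}_2)) = \mathbf{0}.$$

This means that $\mathbf{v} - (x_1\mathbf{u}_1 + x_2\mathbf{u}_2) \in \ker f$ and consequently there are $y_1, y_2,
y_3 \in \mathbb{K}$ such that

$$\mathbf{v} - (x_1\mathbf{u}_1 + x_2\mathbf{u}_2) = y_1\mathbf{v}_1 + y_2\mathbf{v}_2 + y_3\mathbf{v}_3,$$

that is,

$$\mathbf{v} = x_1\mathbf{u}_1 + x_2\mathbf{u}_2 + y_1\mathbf{v}_1 + y_2\mathbf{v}_2 + y_3\mathbf{v}_3.$$

Since $\mathbf{v}$ is an arbitrary vector in $\mathcal{V}$, this shows that $\operatorname{Span}\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} =
\mathcal{V}$. To finish the proof we have to show that the vectors $\mathbf{u}_1, \mathbf{u}_2, \mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ are
linearly independent. To this end suppose that

$$x_1\mathbf{u}_1 + x_2\mathbf{u}_2 + y_1\mathbf{v}_1 + y_2\mathbf{v}_2 + y_3\mathbf{v}_3 = \mathbf{0}. \tag{2.2}$$

Then

$$x_1 f(\mathbf{u}_1) + x_2 f(\mathbf{u}_2) + y_1 f(\mathbf{v}_1) + y_2 f(\mathbf{v}_2) + y_3 f(\mathbf{v}_3) = \mathbf{0}$$

and consequently

$$x_1\mathbf{w}_1 + x_2\mathbf{w}_2 = \mathbf{0},$$

which gives us $x_1 = x_2 = 0$, because the vectors $\mathbf{w}_1$ and $\mathbf{w}_2$ are linearly
independent. Now, since $x_1 = x_2 = 0$, equation (2.2) becomes

$$y_1\mathbf{v}_1 + y_2\mathbf{v}_2 + y_3\mathbf{v}_3 = \mathbf{0},$$

which gives us $y_1 = y_2 = y_3 = 0$, because the vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ are linearly
independent. □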


It turns out that the property of the linear transformation in the example 
above holds for all linear transformations. 


Theorem 2.1.28 (Rank-Nullity Theorem). Let $\mathcal{V}$ be a finite dimensional
vector space and let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. Then

$$\dim\ker f + \dim\operatorname{ran} f = \dim\mathcal{V}.$$


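Proof. The proof is a generalization of the argument presented in Example
2.1.27. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ be a basis of $\ker f$ and let $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ be a basis of
$\operatorname{ran} f$. Then there are $\mathbf{u}_1, \dots, \mathbf{u}_n \in \mathcal{V}$ such that $f(\mathbf{u}_j) = \mathbf{w}_j$ for $1 \le j \le n$.

If $\mathbf{v} \in \mathcal{V}$, then there are $x_1, \dots, x_n \in \mathbb{K}$ such that

$$f(\mathbf{v}) = x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n = x_1 f(\mathbf{u}_1) + \dots + x_n f(\mathbf{u}_n) = f(x_1\mathbf{u}_1 + \dots + x_n\mathbf{u}_n)$$

and thus

$$f(\mathbf{v} - (x_1\mathbf{u}_1 + \dots + x_n\mathbf{u}_n)) = \mathbf{0}.$$

Consequently $\mathbf{v} - (x_1\mathbf{u}_1 + \dots + x_n\mathbf{u}_n) \in \ker f$ and

$$\mathbf{v} = x_1\mathbf{u}_1 + \dots + x_n\mathbf{u}_n + y_1\mathbf{v}_1 + \dots + y_m\mathbf{v}_m$$

for some $y_1, \dots, y_m \in \mathbb{K}$. This shows that $\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_n, \mathbf{v}_1, \dots, \mathbf{v}_m\} = \mathcal{V}$.
To finish the proof we have to show that the vectors $\mathbf{u}_1, \dots, \mathbf{u}_n, \mathbf{v}_1, \dots, \mathbf{v}_m$ are
linearly independent. Suppose that

$$x_1\mathbf{u}_1 + \dots + x_n\mathbf{u}_n + y_1\mathbf{v}_1 + \dots + y_m\mathbf{v}_m = \mathbf{0}. \tag{2.3}$$

By applying $f$ to the above equation and using the fact that $\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ is a
basis of $\ker f$ we obtain

$$x_1\mathbf{w}_1 + \dots + x_n\mathbf{w}_n = \mathbf{0},$$

which gives us $x_1 = \dots = x_n = 0$, because the vectors $\mathbf{w}_1, \dots, \mathbf{w}_n$ are linearly
independent. Now equation (2.3) reduces to

$$y_1\mathbf{v}_1 + \dots + y_m\mathbf{v}_m = \mathbf{0},$$

which gives us $y_1 = \dots = y_m = 0$ in view of linear independence of the vectors
$\mathbf{v}_1, \dots, \mathbf{v}_m$. Hence $\{\mathbf{u}_1, \dots, \mathbf{u}_n, \mathbf{v}_1, \dots, \mathbf{v}_m\}$ is a basis of $\mathcal{V}$ and $\dim\mathcal{V} = n + m$. □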


The above theorem is called the Rank-Nullity Theorem because the number 
dimran f is called the rank of f and the number dimker f is called the nullity 
of f. 


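Example 2.1.29. Let $f : \mathcal{P}_5(\mathbb{R}) \to \mathcal{P}_5(\mathbb{R})$ be the linear transformation defined
by $f(p) = p'''$. Determine $\ker f$, $\dim\ker f$, $\operatorname{ran} f$, $\dim\operatorname{ran} f$, and verify the
Rank-Nullity Theorem.


Solution. If $p''' = 0$, then $p(t) = at^2 + bt + c$ for some $a, b, c \in \mathbb{R}$. Consequently
$\ker f = \operatorname{Span}\{1, t, t^2\}$ and $\dim\ker f = 3$. On the other hand, since

$$(a_5t^5 + a_4t^4 + a_3t^3 + a_2t^2 + a_1t + a_0)''' = 60a_5t^2 + 24a_4t + 6a_3,$$

$\operatorname{ran} f = \operatorname{Span}\{1, t, t^2\}$ and $\dim\operatorname{ran} f = 3$. As stated in the Rank-Nullity
Theorem,

$$\dim\ker f + \dim\operatorname{ran} f = 6 = \dim\mathcal{P}_5(\mathbb{R}).$$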
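
The count in Example 2.1.29 can also be verified mechanically. The sketch below writes the third derivative on $\mathcal{P}_5(\mathbb{R})$ as a $6 \times 6$ matrix with respect to the monomials $1, t, \dots, t^5$ and computes its rank by Gaussian elimination over exact fractions. Using this matrix (rather than working with the polynomials directly) and the helper name `rank` are choices made for this illustration, not the method of the text.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows), by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        # Look for a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

n = 5
# D[k][k + 3] is the coefficient of t^k in the third derivative of t^(k + 3).
D = [[0] * (n + 1) for _ in range(n + 1)]
for k in range(n - 2):
    D[k][k + 3] = (k + 3) * (k + 2) * (k + 1)

rank_f = rank(D)             # dim ran f
nullity_f = (n + 1) - rank_f  # dim ker f, by the Rank-Nullity Theorem
print(rank_f, nullity_f)
```

The nonzero entries of `D` are 6, 24, and 60, the same coefficients that appear in the displayed third derivative.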


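Example 2.1.30. Let $f : \mathcal{P}_5(\mathbb{R}) \to \mathbb{R}^2$ be the linear transformation defined
by $f(p) = \begin{bmatrix} p(5) \\ p'(5) \end{bmatrix}$. Determine $\ker f$, $\dim\ker f$, $\operatorname{ran} f$, $\dim\operatorname{ran} f$, and verify the
Rank-Nullity Theorem.


Solution. If $p \in \mathcal{P}_5(\mathbb{R})$ and $p'(5) = p(5) = 0$, then

$$p(t) = (at^3 + bt^2 + ct + d)(t - 5)^2$$

for some $a, b, c, d \in \mathbb{R}$. Consequently

$$\ker f = \operatorname{Span}\{(t - 5)^2,\ t(t - 5)^2,\ t^2(t - 5)^2,\ t^3(t - 5)^2\}$$

and $\dim\ker f = 4$. On the other hand, since

$$f(p) = \begin{bmatrix} p(5) \\ p'(5) \end{bmatrix} = p(5)\begin{bmatrix} 1 \\ 0 \end{bmatrix} + p'(5)\begin{bmatrix} 0 \\ 1 \end{bmatrix},$$

$\operatorname{ran} f = \operatorname{Span}\left\{\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\} = \mathbb{R}^2$ and $\dim\operatorname{ran} f = 2$. As stated in the
Rank-Nullity Theorem,

$$\dim\ker f + \dim\operatorname{ran} f = 6 = \dim\mathcal{P}_5(\mathbb{R}).$$

□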


The following simple consequence of the Rank-Nullity Theorem is often 
used. 


Corollary 2.1.31. If $f : \mathcal{V} \to \mathbb{K}$ is a nonzero linear transformation,
then $\dim\ker f = \dim\mathcal{V} - 1$.


2.2 Isomorphisms 


Consider the vector space $\mathcal{P}_n(\mathbb{K})$ of all functions $p : \mathbb{K} \to \mathbb{K}$ of the form $p(t) =
a_0 + a_1t + \dots + a_nt^n$ where $a_0, a_1, \dots, a_n \in \mathbb{K}$. Since the polynomial $p(t) =
a_0 + a_1t + \dots + a_nt^n$ is completely determined by the numbers $a_0, a_1, \dots, a_n$,
one could say that the space $\mathcal{P}_n(\mathbb{K})$ can be “identified” with the space $\mathbb{K}^{n+1}$. We
expect the vector spaces $\mathcal{P}_n(\mathbb{K})$ and $\mathbb{K}^{n+1}$ to have the same algebraic properties.
We could say that, from the point of view of linear algebra, $\mathcal{P}_n(\mathbb{K})$ and $\mathbb{K}^{n+1}$ are
two “representations” of the same vector space. This point of view is important
in linear algebra. In this section we will make this idea precise and examine
some of its consequences.


Definition 2.2.1. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces. A linear transformation
$f : \mathcal{V} \to \mathcal{W}$ that is both injective and surjective is called an
isomorphism of vector spaces or simply an isomorphism. Vector spaces
$\mathcal{V}$ and $\mathcal{W}$ are called isomorphic if there is an isomorphism $f : \mathcal{V} \to \mathcal{W}$.


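Example 2.2.2. The vector spaces $\mathcal{P}_n(\mathbb{K})$ and $\mathbb{K}^{n+1}$ are isomorphic. Indeed,
the function

$$f(a_0 + a_1t + \dots + a_nt^n) = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix}$$

is an isomorphism from $\mathcal{P}_n(\mathbb{K})$ onto $\mathbb{K}^{n+1}$.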


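Theorem 2.2.3. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces. If $f : \mathcal{V} \to \mathcal{W}$ is an
isomorphism, then its inverse $f^{-1} : \mathcal{W} \to \mathcal{V}$ is a linear transformation.


Proof. Let $\mathbf{w} \in \mathcal{W}$ and $\alpha \in \mathbb{K}$. Since $f$ is surjective, $\mathbf{w} = f(\mathbf{v})$ for some $\mathbf{v} \in \mathcal{V}$.
From linearity of $f$ we get

$$f^{-1}(\alpha\mathbf{w}) = f^{-1}(\alpha f(\mathbf{v})) = f^{-1}(f(\alpha\mathbf{v})) = \alpha\mathbf{v} = \alpha f^{-1}(\mathbf{w}).$$

Now, let $\mathbf{w}_1, \mathbf{w}_2 \in \mathcal{W}$. Since $f$ is surjective, $\mathbf{w}_1 = f(\mathbf{v}_1)$ and $\mathbf{w}_2 = f(\mathbf{v}_2)$ for
some $\mathbf{v}_1, \mathbf{v}_2 \in \mathcal{V}$ and, from linearity of $f$, we get

$$\begin{aligned}
f^{-1}(\mathbf{w}_1 + \mathbf{w}_2) &= f^{-1}(f(\mathbf{v}_1) + f(\mathbf{v}_2)) = f^{-1}(f(\mathbf{v}_1 + \mathbf{v}_2)) \\
&= \mathbf{v}_1 + \mathbf{v}_2 = f^{-1}(\mathbf{w}_1) + f^{-1}(\mathbf{w}_2).
\end{aligned}$$

□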


Corollary 2.2.4. The inverse of an isomorphism is an isomorphism. 


Since the function f in the definition of an isomorphism maps V onto W, it 
may seem that the role of V in the definition is different from the role of W, 
but in view of the above corollary we know that it is not the case. 

In the next theorem we characterize isomorphisms in terms of bases. 


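Theorem 2.2.5. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces and let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be
a basis of $\mathcal{V}$. A linear transformation $f : \mathcal{V} \to \mathcal{W}$ is an isomorphism if
and only if the set $\{f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)\}$ is a basis of $\mathcal{W}$.


Proof. Assume that $\{f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)\}$ is a basis of $\mathcal{W}$. We first show that
$f$ is injective. If $f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = \mathbf{0}$ for some $x_1, \dots, x_n \in \mathbb{K}$, then
$x_1 f(\mathbf{v}_1) + \dots + x_n f(\mathbf{v}_n) = \mathbf{0}$ and thus $x_1 = \dots = x_n = 0$. This means that
$\ker f = \{\mathbf{0}\}$ and consequently $f$ is injective, by Theorem 2.1.23.

To show that $f$ is surjective we consider an arbitrary $\mathbf{w} \in \mathcal{W}$. Then $\mathbf{w} =
x_1 f(\mathbf{v}_1) + \dots + x_n f(\mathbf{v}_n)$ for some $x_1, \dots, x_n \in \mathbb{K}$. Since

$$\mathbf{w} = x_1 f(\mathbf{v}_1) + \dots + x_n f(\mathbf{v}_n) = f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n),$$

we have $\mathbf{w} \in \operatorname{ran} f$.

Now we assume that $f$ is an isomorphism. If $\mathbf{w} \in \mathcal{W}$, then there is $\mathbf{v} \in \mathcal{V}$
such that $f(\mathbf{v}) = \mathbf{w}$. Since $\mathbf{v} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n$ for some $x_1, \dots, x_n \in \mathbb{K}$, we
have

$$\mathbf{w} = f(\mathbf{v}) = f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = x_1 f(\mathbf{v}_1) + \dots + x_n f(\mathbf{v}_n).$$

This shows that $\operatorname{Span}\{f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)\} = \mathcal{W}$. To show that the vectors
$f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)$ are linearly independent we suppose that $x_1 f(\mathbf{v}_1) + \dots +
x_n f(\mathbf{v}_n) = \mathbf{0}$ for some $x_1, \dots, x_n \in \mathbb{K}$. Then $f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = \mathbf{0}$ and thus
$x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = \mathbf{0}$, because $\ker f = \{\mathbf{0}\}$. Since $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a basis, we
conclude that $x_1 = \dots = x_n = 0$. Consequently, the set $\{f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)\}$ is a
basis of the vector space $\mathcal{W}$. □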


As an immediate consequence of the above theorem we obtain the following 
important result. 


Corollary 2.2.6. Finite dimensional vector spaces $\mathcal{V}$ and $\mathcal{W}$ are
isomorphic if and only if $\dim\mathcal{V} = \dim\mathcal{W}$.


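Example 2.2.7. Let $\mathcal{V}$ be a finite dimensional vector space and let $\mathcal{U}$, $\mathcal{W}_1$,
and $\mathcal{W}_2$ be subspaces of $\mathcal{V}$ such that $\mathcal{V} = \mathcal{U} \oplus \mathcal{W}_1 = \mathcal{U} \oplus \mathcal{W}_2$. Then $\dim\mathcal{W}_1 =
\dim\mathcal{W}_2$, by Theorem 1.4.18. Consequently, the vector spaces $\mathcal{W}_1$ and $\mathcal{W}_2$ are
isomorphic.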


Isomorphic vector spaces have the same algebraic properties and, as we men- 
tioned at the beginning of this section, from the point of view of linear algebra, 
isomorphic vector spaces can be thought of as different representations of the 
same vector space. The following corollary says that every vector space over $\mathbb{K}$
of dimension $n$ is basically a version of $\mathbb{K}^n$.




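Corollary 2.2.8. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis of a vector space $\mathcal{V}$. The
function $f : \mathcal{V} \to \mathbb{K}^n$ defined by

$$f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$$

for all $x_1, \dots, x_n \in \mathbb{K}$, is an isomorphism.


Proof. We have

$$f(\mathbf{v}_1) = \mathbf{e}_1, \ \dots, \ f(\mathbf{v}_n) = \mathbf{e}_n,$$

where $\mathbf{e}_1, \dots, \mathbf{e}_n$ is the standard basis of $\mathbb{K}^n$. Since $f$ is linear, it is an
isomorphism by Theorem 2.2.5. □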


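Example 2.2.9. Let $\mathcal{V}_1, \dots, \mathcal{V}_n$ be subspaces of a vector space $\mathcal{V}$. Show that
the function $f : \mathcal{V}_1 \times \dots \times \mathcal{V}_n \to \mathcal{V}$ defined by

$$f(\mathbf{v}_1, \dots, \mathbf{v}_n) = \mathbf{v}_1 + \dots + \mathbf{v}_n$$

is an isomorphism if and only if

$$\mathcal{V}_1 \oplus \dots \oplus \mathcal{V}_n = \mathcal{V}.$$

Solution. The function $f$ is clearly a linear transformation.

If $f$ is an isomorphism, then $\operatorname{ran} f = \mathcal{V}$ and $\ker f = \{\mathbf{0}\}$. Since $\operatorname{ran} f = \mathcal{V}$,
we have $\mathcal{V}_1 + \dots + \mathcal{V}_n = \mathcal{V}$. Since $\ker f = \{\mathbf{0}\}$, $\mathbf{v}_1 + \dots + \mathbf{v}_n = \mathbf{0}$ implies
$\mathbf{v}_1 = \dots = \mathbf{v}_n = \mathbf{0}$, which means that the sum $\mathcal{V}_1 + \dots + \mathcal{V}_n$ is direct.
Now we assume that $\mathcal{V}_1 \oplus \dots \oplus \mathcal{V}_n = \mathcal{V}$. Then $\operatorname{ran} f = \mathcal{V}$ and $\ker f = \{\mathbf{0}\}$,
so $f$ is an isomorphism. □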


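Example 2.2.10. Let $\mathcal{V}$ be an arbitrary vector space. Show that $\mathcal{V}$ and
$\mathcal{L}(\mathbb{K}, \mathcal{V})$ (the vector space of all linear transformations $f : \mathbb{K} \to \mathcal{V}$) are
isomorphic.


Solution. For $\mathbf{v} \in \mathcal{V}$ let $t_{\mathbf{v}} : \mathbb{K} \to \mathcal{V}$ be the function defined by

$$t_{\mathbf{v}}(x) = x\mathbf{v}.$$

Note that $t_{\mathbf{v}} \in \mathcal{L}(\mathbb{K}, \mathcal{V})$. We will show that $f : \mathcal{V} \to \mathcal{L}(\mathbb{K}, \mathcal{V})$ defined by

$$f(\mathbf{v}) = t_{\mathbf{v}}$$

is an isomorphism.
It is easy to verify that $f$ is a linear injection. We need to show that $f$ is
a surjection. Consider an arbitrary $s \in \mathcal{L}(\mathbb{K}, \mathcal{V})$. Then

$$s(x) = s(x \cdot 1) = x\,s(1).$$

If we let $\mathbf{v} = s(1)$, then we have $s = t_{\mathbf{v}} = f(\mathbf{v})$. Consequently, $\operatorname{ran} f = \mathcal{L}(\mathbb{K}, \mathcal{V})$.
Note that, as a particular case, we get that $\mathbb{K}$ and $\mathcal{L}(\mathbb{K}, \mathbb{K}) = \mathcal{L}(\mathbb{K})$ are
isomorphic. □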


We close this section with a theorem that gives several useful characteriza- 
tions of isomorphisms from a vector space to itself. 


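Theorem 2.2.11. Let $\mathcal{V}$ be a finite dimensional vector space and let
$f : \mathcal{V} \to \mathcal{V}$ be a linear transformation. The following conditions are
equivalent:

(a) $f$ is an isomorphism;

(b) $\ker f = \{\mathbf{0}\}$;

(c) $\operatorname{ran} f = \mathcal{V}$;

(d) $f$ is left invertible, that is, there is a function $g : \mathcal{V} \to \mathcal{V}$ such that
$gf = \operatorname{Id}$;

(e) $f$ is right invertible, that is, there is a function $g : \mathcal{V} \to \mathcal{V}$ such
that $fg = \operatorname{Id}$.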


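Proof. Clearly (a) implies each of the remaining four conditions. Let $\{\mathbf{v}_1, \dots,
\mathbf{v}_n\}$ be a basis of $\mathcal{V}$.
Assume $\ker f = \{\mathbf{0}\}$. If

$$x_1 f(\mathbf{v}_1) + \dots + x_n f(\mathbf{v}_n) = \mathbf{0},$$

then

$$f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = \mathbf{0}$$

and consequently $x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n = \mathbf{0}$, since $\ker f = \{\mathbf{0}\}$. Hence $x_1 =
\dots = x_n = 0$, because the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly independent. This
proves that the vectors $f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)$ are linearly independent. Consequently
$\{f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)\}$ is a basis of $\mathcal{V}$ and we have $\operatorname{ran} f = \mathcal{V}$. This shows that (b)
implies (c).
If $\operatorname{ran} f = \mathcal{V}$, then $\operatorname{Span}\{f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)\} = \mathcal{V}$ and thus $\{f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)\}$
is a basis of $\mathcal{V}$. Hence $\ker f = \{\mathbf{0}\}$, by the Rank-Nullity Theorem. This shows
that (c) implies (b).

Since (b) and (c) are equivalent and (b) and (c) together are equivalent to
(a), all three conditions are equivalent.

Now we show that (d) implies (b). Indeed, if there is a function $g : \mathcal{V} \to \mathcal{V}$
such that $gf = \operatorname{Id}$ and $f(\mathbf{x}) = f(\mathbf{y})$, then

$$\mathbf{x} = g(f(\mathbf{x})) = g(f(\mathbf{y})) = \mathbf{y}.$$

This shows that $f$ is injective and thus $\ker f = \{\mathbf{0}\}$.

Finally we show that (e) implies (c). Assume there is a function $g : \mathcal{V} \to \mathcal{V}$
such that $fg = \operatorname{Id}$. Then for every $\mathbf{x} \in \mathcal{V}$ we have $\mathbf{x} = f(g(\mathbf{x}))$ and thus
$\operatorname{ran} f = \mathcal{V}$. □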


2.3 Linear transformations and matrices 


2.3.1 The matrix of a linear transformation 


At the beginning of this chapter we observed that an $n \times m$ matrix with entries in $\mathbb{K}$ defines a linear transformation from $\mathbb{K}^m$ to $\mathbb{K}^n$. It turns out that all linear transformations between finite dimensional spaces can be described in terms of matrix multiplication. We begin with an example.


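Example 2.3.1. Let $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ and $\{\mathbf{w}_1, \mathbf{w}_2\}$ be bases of $\mathcal{V}$ and $\mathcal{W}$, respectively, and let $f : \mathcal{V} \to \mathcal{W}$ be the linear transformation defined by
\[ f(\mathbf{v}_1) = a_{11}\mathbf{w}_1 + a_{21}\mathbf{w}_2, \]
\[ f(\mathbf{v}_2) = a_{12}\mathbf{w}_1 + a_{22}\mathbf{w}_2, \]
\[ f(\mathbf{v}_3) = a_{13}\mathbf{w}_1 + a_{23}\mathbf{w}_2. \]
Show that for every $\mathbf{v} = x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + x_3\mathbf{v}_3 \in \mathcal{V}$ we have
\[ f(\mathbf{v}) = y_1\mathbf{w}_1 + y_2\mathbf{w}_2, \]
where the numbers $y_1$ and $y_2$ are given by the equality
\[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}. \]

Solution. Since
\[ \begin{aligned}
f(\mathbf{v}) &= f(x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + x_3\mathbf{v}_3) \\
&= x_1 f(\mathbf{v}_1) + x_2 f(\mathbf{v}_2) + x_3 f(\mathbf{v}_3) \\
&= x_1(a_{11}\mathbf{w}_1 + a_{21}\mathbf{w}_2) + x_2(a_{12}\mathbf{w}_1 + a_{22}\mathbf{w}_2) + x_3(a_{13}\mathbf{w}_1 + a_{23}\mathbf{w}_2) \\
&= \left( \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) \mathbf{w}_1 + \left( \begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) \mathbf{w}_2,
\end{aligned} \]
we have
\[ \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \\[2ex] \begin{bmatrix} a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}. \]
□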
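The computation in Example 2.3.1 can be checked numerically. The following is a minimal sketch, not part of the original text: the entries of `A` and the coordinates in `x` are made-up values, and vectors are represented only by their coordinate lists. It computes the coordinates of $f(\mathbf{v})$ once as a matrix product and once directly by linearity.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a column vector (list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical entries a_ij: column j lists the coordinates of f(v_j) in {w1, w2}.
A = [[1, 2, 0],
     [-1, 3, 5]]

# Hypothetical coordinates x1, x2, x3 of v in the basis {v1, v2, v3}.
x = [2, 1, -1]

# Coordinates (y1, y2) of f(v), first as the matrix product A x ...
y_matrix = mat_vec(A, x)

# ... and then by linearity, as x1*f(v1) + x2*f(v2) + x3*f(v3).
columns = [[row[j] for row in A] for j in range(3)]
y_linear = [sum(x[j] * columns[j][i] for j in range(3)) for i in range(2)]

print(y_matrix, y_linear)
```

Both computations give the same pair of coordinates, which is exactly the content of the example.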


The observation in the above example can be generalized to an arbitrary 
linear transformation between finite dimensional spaces. 


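Theorem 2.3.2. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ be bases of vector spaces $\mathcal{V}$ and $\mathcal{W}$, respectively. For every linear transformation $f : \mathcal{V} \to \mathcal{W}$ there is a unique $n \times m$ matrix
\[ \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \]
such that for every $\mathbf{v} = x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m \in \mathcal{V}$ we have
\[ f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m) = y_1\mathbf{w}_1 + \dots + y_n\mathbf{w}_n, \]
where the numbers $y_1, \dots, y_n \in \mathbb{K}$ are given by
\[ \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix}. \]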


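Proof. For every $1 \le j \le m$ there are unique $a_{1j}, \dots, a_{nj} \in \mathbb{K}$ such that
\[ f(\mathbf{v}_j) = a_{1j}\mathbf{w}_1 + \dots + a_{nj}\mathbf{w}_n. \]
If $\mathbf{v} = x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m$ is an arbitrary vector in $\mathcal{V}$, then
\[ \begin{aligned}
f(\mathbf{v}) &= f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m) \\
&= x_1 f(\mathbf{v}_1) + \dots + x_m f(\mathbf{v}_m) \\
&= x_1(a_{11}\mathbf{w}_1 + \dots + a_{n1}\mathbf{w}_n) + \dots + x_m(a_{1m}\mathbf{w}_1 + \dots + a_{nm}\mathbf{w}_n) \\
&= \left( \begin{bmatrix} a_{11} & \dots & a_{1m} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} \right) \mathbf{w}_1 + \dots + \left( \begin{bmatrix} a_{n1} & \dots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} \right) \mathbf{w}_n.
\end{aligned} \]
Consequently, if $f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m) = y_1\mathbf{w}_1 + \dots + y_n\mathbf{w}_n$, then
\[ y_k = \begin{bmatrix} a_{k1} & \dots & a_{km} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} \]
for all $1 \le k \le n$, which is equivalent to
\[ \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix}. \]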
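The proof shows how the matrix is built: its $j$-th column consists of the coordinates of $f(\mathbf{v}_j)$. As an illustration only (the choice of transformation is ours, not the book's), the sketch below does this for differentiation from $\mathcal{P}_3(\mathbb{K})$ to $\mathcal{P}_2(\mathbb{K})$ with the bases $\{1, t, t^2, t^3\}$ and $\{1, t, t^2\}$. Polynomials are stored as coefficient lists, and the helper names `derivative`, `matrix_of`, and `mat_vec` are hypothetical.

```python
def derivative(p):
    """Differentiate a polynomial given by its coefficients [a0, a1, ..., ad]."""
    return [k * p[k] for k in range(1, len(p))]

def matrix_of(f, m):
    """Matrix whose j-th column holds the coordinates of f(j-th basis vector)."""
    columns = [f([1 if i == j else 0 for i in range(m)]) for j in range(m)]
    n = len(columns[0])
    return [[columns[j][i] for j in range(m)] for i in range(n)]

def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

D = matrix_of(derivative, 4)   # 3 x 4 matrix of d/dt in the monomial bases
p = [5, 4, 3, 2]               # p(t) = 5 + 4t + 3t^2 + 2t^3

print(D)
print(mat_vec(D, p), derivative(p))
```

Multiplying the coordinate column of $p$ by `D` gives the coordinates of $p'$, as Theorem 2.3.2 predicts.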


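Definition 2.3.3. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation and let $\mathcal{B} = \{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\mathcal{C} = \{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ be bases of $\mathcal{V}$ and $\mathcal{W}$, respectively. The matrix
\[ \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \]
in Theorem 2.3.2 is called the matrix of $f$ relative to the bases $\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ and is denoted by $f_{\mathcal{B}\to\mathcal{C}}$.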


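Example 2.3.4. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation and let $\mathcal{B} = \{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\mathcal{C} = \{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ be bases of $\mathcal{V}$ and $\mathcal{W}$, respectively. If
\[ A = f_{\mathcal{B}\to\mathcal{C}} = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \]
is the matrix of $f$ relative to the bases $\mathcal{B}$ and $\mathcal{C}$, show that there is an isomorphism $g : \ker f \to \mathbf{N}(A)$ such that
\[ g(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m) = \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} \]
whenever $x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m \in \ker f$.

Solution. It suffices to observe that $x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m \in \ker f$ is equivalent to
\[ \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix}, \]
by Theorem 2.3.2, and that the function $h : \mathcal{V} \to \mathbb{K}^m$ defined by
\[ h(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m) = \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} \]
is an isomorphism. Consequently, the restriction of $h$ to $\ker f$ is an isomorphism $g : \ker f \to \mathbf{N}(A)$. □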


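Example 2.3.5. Let $\mathcal{V}$ be a vector space with a basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ and let $\mathcal{W}$ be a vector space with a basis $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation such that the matrix of $f$ relative to the bases $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ and $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ is
\[ \begin{bmatrix} 1 & 2 & 2 & 1 \\ 2 & 1 & 3 & 5 \\ 4 & 5 & 7 & 7 \end{bmatrix}. \]
Find a basis of $\ker f$.

Solution. It is easy to verify that
\[ \left\{ \begin{bmatrix} 4 \\ 1 \\ -3 \\ 0 \end{bmatrix}, \begin{bmatrix} -7 \\ 0 \\ 3 \\ 1 \end{bmatrix} \right\} \]
is a basis of
\[ \mathbf{N}\left( \begin{bmatrix} 1 & 2 & 2 & 1 \\ 2 & 1 & 3 & 5 \\ 4 & 5 & 7 & 7 \end{bmatrix} \right). \]
Consequently, by Example 2.3.4,
\[ \{4\mathbf{v}_1 + \mathbf{v}_2 - 3\mathbf{v}_3,\; -7\mathbf{v}_1 + 3\mathbf{v}_3 + \mathbf{v}_4\} \]
is a basis of $\ker f$. □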
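The phrase "it is easy to verify" can be made concrete. The sketch below is our own check, under the assumption that the matrix of the example is the one with rows $(1,2,2,1)$, $(2,1,3,5)$, $(4,5,7,7)$ and that the two coordinate vectors are $(4,1,-3,0)$ and $(-7,0,3,1)$. It confirms that both vectors are sent to zero and that they are linearly independent.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2, 2, 1],
     [2, 1, 3, 5],
     [4, 5, 7, 7]]

# Coordinate vectors of the proposed basis of ker f.
kernel_basis = [[4, 1, -3, 0],
                [-7, 0, 3, 1]]

# Each vector must be mapped to the zero vector.
images = [mat_vec(A, x) for x in kernel_basis]

# Independence: the 2 x 2 minor built from the 2nd and 4th coordinates is nonzero.
minor = (kernel_basis[0][1] * kernel_basis[1][3]
         - kernel_basis[0][3] * kernel_basis[1][1])

print(images, minor)
```

Since the matrix has rank 2, its null space has dimension $4 - 2 = 2$, so two independent vectors in the null space form a basis.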
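Example 2.3.6. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ be bases of vector spaces $\mathcal{V}$ and $\mathcal{W}$, respectively. If $f : \mathcal{V} \to \mathcal{W}$ is the linear transformation such that the matrix of $f$ relative to the bases $\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ is
\[ A = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix}, \]
show that there is an isomorphism $g : \operatorname{ran} f \to \mathbf{C}(A)$ such that
\[ g(f(\mathbf{v}_j)) = \begin{bmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{bmatrix} \]
for all $1 \le j \le m$.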


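Solution. We define $g : \operatorname{ran} f \to \mathbf{C}(A)$ by
\[ g(f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m)) = x_1 \begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix} + \dots + x_m \begin{bmatrix} a_{1m} \\ \vdots \\ a_{nm} \end{bmatrix}. \]
To show that $g$ is well-defined assume that $f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m) = f(x_1'\mathbf{v}_1 + \dots + x_m'\mathbf{v}_m)$. Then
\[ f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m) = y_1\mathbf{w}_1 + \dots + y_n\mathbf{w}_n = f(x_1'\mathbf{v}_1 + \dots + x_m'\mathbf{v}_m), \]
for some $y_1, \dots, y_n \in \mathbb{K}$. By Theorem 2.3.2, this is equivalent to
\[ \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1' \\ \vdots \\ x_m' \end{bmatrix}. \]
Consequently,
\[ x_1 \begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix} + \dots + x_m \begin{bmatrix} a_{1m} \\ \vdots \\ a_{nm} \end{bmatrix} = x_1' \begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix} + \dots + x_m' \begin{bmatrix} a_{1m} \\ \vdots \\ a_{nm} \end{bmatrix}, \]
proving that the function $g$ is well-defined.
Clearly, $g$ is a linear transformation. If $g(f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m)) = \mathbf{0}$, then
\[ \begin{aligned}
\mathbf{0} &= g(f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m)) \\
&= x_1 \begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix} + \dots + x_m \begin{bmatrix} a_{1m} \\ \vdots \\ a_{nm} \end{bmatrix} \\
&= \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix},
\end{aligned} \]
where $y_1, \dots, y_n$ are the coordinates of $f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m)$ in the basis $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$. Hence
\[ f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m) = y_1\mathbf{w}_1 + \dots + y_n\mathbf{w}_n = \mathbf{0}, \]
proving that $g$ is injective.

Finally, if $\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \in \mathbf{C}(A)$, then
\[ \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = x_1 \begin{bmatrix} a_{11} \\ \vdots \\ a_{n1} \end{bmatrix} + \dots + x_m \begin{bmatrix} a_{1m} \\ \vdots \\ a_{nm} \end{bmatrix} \]
for some $x_1, \dots, x_m \in \mathbb{K}$ and, consequently,
\[ g(f(x_1\mathbf{v}_1 + \dots + x_m\mathbf{v}_m)) = \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \]
proving that $g$ is surjective. □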




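Example 2.3.7. Let $\mathcal{V}$ be a vector space with a basis $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ and let $\mathcal{W}$ be a vector space with a basis $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$. If $f : \mathcal{V} \to \mathcal{W}$ is a linear transformation such that the matrix of $f$ relative to the bases $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ and $\{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ is
\[ \begin{bmatrix} 1 & 2 & 2 & 1 \\ 2 & 1 & 3 & 5 \\ 4 & 5 & 7 & 7 \end{bmatrix}, \]
find a basis of $\operatorname{ran} f$.

Solution. Since the reduced row echelon form of the matrix $\begin{bmatrix} 1 & 2 & 2 & 1 \\ 2 & 1 & 3 & 5 \\ 4 & 5 & 7 & 7 \end{bmatrix}$ is $\begin{bmatrix} 1 & 0 & 4/3 & 3 \\ 0 & 1 & 1/3 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$, the first two columns of the matrix form a basis of its column space. By Example 2.3.6, the set
\[ \{\mathbf{w}_1 + 2\mathbf{w}_2 + 4\mathbf{w}_3,\; 2\mathbf{w}_1 + \mathbf{w}_2 + 5\mathbf{w}_3\} \]
is a basis of $\operatorname{ran} f$. □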
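The row reduction used in this solution can be reproduced with exact arithmetic. The following is a small sketch of Gauss-Jordan elimination (the function name `rref` is ours); it returns the reduced row echelon form together with the pivot columns, and the corresponding columns of the original matrix give the coordinates of a basis of $\operatorname{ran} f$.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0  # index of the row where the next pivot will be placed
    for c in range(len(M[0])):
        # Find a row at or below r with a nonzero entry in column c.
        pivot_row = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot_row is None:
            continue
        M[r], M[pivot_row] = M[pivot_row], M[r]
        pivot = M[r][c]
        M[r] = [x / pivot for x in M[r]]
        # Clear column c in every other row.
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

A = [[1, 2, 2, 1],
     [2, 1, 3, 5],
     [4, 5, 7, 7]]

R, pivots = rref(A)
# Columns of A in the pivot positions: coordinates of a basis of ran f.
ran_basis = [[row[c] for row in A] for c in pivots]
print(R, pivots, ran_basis)
```

The pivot columns are the first and the second, which matches the basis $\{\mathbf{w}_1 + 2\mathbf{w}_2 + 4\mathbf{w}_3, 2\mathbf{w}_1 + \mathbf{w}_2 + 5\mathbf{w}_3\}$ found above.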


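Theorem 2.3.8. Let $\mathcal{B}$, $\mathcal{C}$, and $\mathcal{D}$ be bases of vector spaces $\mathcal{V}$, $\mathcal{W}$, and $\mathcal{X}$, respectively, and let $f : \mathcal{V} \to \mathcal{W}$ and $g : \mathcal{W} \to \mathcal{X}$ be linear transformations. If $A$ is the matrix of $f$ relative to the bases $\mathcal{B}$ and $\mathcal{C}$ and $B$ is the matrix of $g$ relative to the bases $\mathcal{C}$ and $\mathcal{D}$, then the matrix $BA$ is the matrix of $gf$ relative to the bases $\mathcal{B}$ and $\mathcal{D}$. In other words,
\[ (gf)_{\mathcal{B}\to\mathcal{D}} = g_{\mathcal{C}\to\mathcal{D}}\, f_{\mathcal{B}\to\mathcal{C}}. \]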


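Proof. If $\mathcal{B} = \{\mathbf{v}_1, \dots, \mathbf{v}_m\}$, $\mathcal{C} = \{\mathbf{w}_1, \dots, \mathbf{w}_n\}$, $\mathcal{D} = \{\mathbf{x}_1, \dots, \mathbf{x}_p\}$, and
\[ f_{\mathcal{B}\to\mathcal{C}} = A = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \quad\text{and}\quad g_{\mathcal{C}\to\mathcal{D}} = B = \begin{bmatrix} b_{11} & \dots & b_{1n} \\ \vdots & & \vdots \\ b_{p1} & \dots & b_{pn} \end{bmatrix}, \]
where $A \in \mathcal{M}_{n\times m}(\mathbb{K})$ and $B \in \mathcal{M}_{p\times n}(\mathbb{K})$, then
\[ f(\mathbf{v}_j) = a_{1j}\mathbf{w}_1 + \dots + a_{nj}\mathbf{w}_n \]
for all $1 \le j \le m$, and
\[ g(\mathbf{w}_k) = b_{1k}\mathbf{x}_1 + \dots + b_{pk}\mathbf{x}_p \]
for all $1 \le k \le n$. Since
\[ \begin{aligned}
g(f(\mathbf{v}_j)) &= a_{1j}\, g(\mathbf{w}_1) + \dots + a_{nj}\, g(\mathbf{w}_n) \\
&= a_{1j}(b_{11}\mathbf{x}_1 + \dots + b_{p1}\mathbf{x}_p) + \dots + a_{nj}(b_{1n}\mathbf{x}_1 + \dots + b_{pn}\mathbf{x}_p) \\
&= \left( \begin{bmatrix} b_{11} & \dots & b_{1n} \end{bmatrix} \begin{bmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{bmatrix} \right) \mathbf{x}_1 + \dots + \left( \begin{bmatrix} b_{p1} & \dots & b_{pn} \end{bmatrix} \begin{bmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{bmatrix} \right) \mathbf{x}_p
\end{aligned} \]
and the $j$-th column of the matrix
\[ \begin{bmatrix} b_{11} & \dots & b_{1n} \\ \vdots & & \vdots \\ b_{p1} & \dots & b_{pn} \end{bmatrix} \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \]
is
\[ \begin{bmatrix} \begin{bmatrix} b_{11} & \dots & b_{1n} \end{bmatrix} \begin{bmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{bmatrix} \\ \vdots \\ \begin{bmatrix} b_{p1} & \dots & b_{pn} \end{bmatrix} \begin{bmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{bmatrix} \end{bmatrix} = \begin{bmatrix} b_{11} & \dots & b_{1n} \\ \vdots & & \vdots \\ b_{p1} & \dots & b_{pn} \end{bmatrix} \begin{bmatrix} a_{1j} \\ \vdots \\ a_{nj} \end{bmatrix}, \]
the matrix of $gf$ relative to the bases $\mathcal{B}$ and $\mathcal{D}$ is $BA$. □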
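A quick numerical illustration of Theorem 2.3.8, with made-up matrices and with the standard bases of $\mathbb{K}^3$ and $\mathbb{K}^2$ assumed: applying $f$ and then $g$ to each basis vector gives the same coordinates as the corresponding column of the product $BA$. The helper names are hypothetical.

```python
def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def mat_mul(B, A):
    """Matrix product B A for matrices stored as lists of rows."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

A = [[1, 2, 0],     # hypothetical matrix of f : K^3 -> K^2
     [0, 1, 3]]
B = [[2, 1],        # hypothetical matrix of g : K^2 -> K^2
     [1, -1]]

BA = mat_mul(B, A)

# Column j of BA versus g(f(e_j)) for each standard basis vector e_j.
checks = []
for j in range(3):
    e_j = [1 if i == j else 0 for i in range(3)]
    composed = mat_vec(B, mat_vec(A, e_j))
    column = [row[j] for row in BA]
    checks.append(composed == column)

print(BA, checks)
```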


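Corollary 2.3.9. Let $\mathcal{B}$ and $\mathcal{C}$ be bases of a vector space $\mathcal{V}$. Then the matrix of the identity function $\operatorname{Id} : \mathcal{V} \to \mathcal{V}$ relative to the bases $\mathcal{B}$ and $\mathcal{C}$ is invertible and its inverse is the matrix of the identity function relative to the bases $\mathcal{C}$ and $\mathcal{B}$.

Proof. This is an immediate consequence of Theorem 2.3.8 because
\[ \operatorname{Id}_{\mathcal{C}\to\mathcal{B}} \operatorname{Id}_{\mathcal{B}\to\mathcal{C}} = \operatorname{Id}_{\mathcal{B}\to\mathcal{B}} \quad\text{and}\quad \operatorname{Id}_{\mathcal{B}\to\mathcal{C}} \operatorname{Id}_{\mathcal{C}\to\mathcal{B}} = \operatorname{Id}_{\mathcal{C}\to\mathcal{C}}. \]
□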


Example 2.3.10. We consider the vector space $\mathcal{M}_{2\times 2}(\mathbb{K})$ and the bases


es erate eee ately 




e={f fo d-6 41-0}. 


Determine the matrix of Id relative to the bases B and C. 


and 


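Solution. Since the matrix of $\operatorname{Id}$ relative to the bases $\mathcal{C}$ and $\mathcal{B}$ is
\[ \begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix}, \]
the matrix of $\operatorname{Id}$ relative to the bases $\mathcal{B}$ and $\mathcal{C}$ is
\[ \begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 1 & -1 & 0 & 0 \\ -1 & 0 & 0 & 1 \end{bmatrix}. \]
□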
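The inverse in this solution can be verified by multiplying the two matrices in both orders. The sketch below assumes that the matrix of $\operatorname{Id}$ relative to $\mathcal{C}$ and $\mathcal{B}$ has rows $(1,1,1,0)$, $(1,1,0,0)$, $(1,0,0,0)$, $(1,1,1,1)$; this is our reading of the example, and the variable names are ours.

```python
def mat_mul(A, B):
    """Matrix product A B for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Assumed matrix of Id relative to the bases C and B.
id_C_to_B = [[1, 1, 1, 0],
             [1, 1, 0, 0],
             [1, 0, 0, 0],
             [1, 1, 1, 1]]

# Claimed matrix of Id relative to the bases B and C.
id_B_to_C = [[0, 0, 1, 0],
             [0, 1, -1, 0],
             [1, -1, 0, 0],
             [-1, 0, 0, 1]]

identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

left = mat_mul(id_C_to_B, id_B_to_C)
right = mat_mul(id_B_to_C, id_C_to_B)
print(left == identity, right == identity)
```

Both products equal the identity matrix, in agreement with Corollary 2.3.9.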
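If $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a basis of a vector space $\mathcal{V}$ and $f : \mathcal{V} \to \mathcal{V}$ is a linear transformation, then there is a unique $n \times n$ matrix $A$ such that for every $\mathbf{v} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n \in \mathcal{V}$ we have
\[ f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = y_1\mathbf{v}_1 + \dots + y_n\mathbf{v}_n, \]
where the numbers $y_1, \dots, y_n \in \mathbb{K}$ are given by
\[ \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} = A \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}. \]
This is simply a special case of Theorem 2.3.2. We will say that $A$ is the matrix of the linear transformation $f : \mathcal{V} \to \mathcal{V}$ relative to the basis $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$.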


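Theorem 2.3.11. Let $\mathcal{B}$ and $\mathcal{C}$ be bases of a vector space $\mathcal{V}$ and let $f : \mathcal{V} \to \mathcal{V}$ be a linear transformation. Then the matrix of $f$ relative to the basis $\mathcal{C}$ is
\[ M = P^{-1} N P, \]
where $N$ is the matrix of the linear transformation $f$ relative to the basis $\mathcal{B}$ and $P$ is the matrix of the identity function $\operatorname{Id} : \mathcal{V} \to \mathcal{V}$ relative to the bases $\mathcal{C}$ and $\mathcal{B}$.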




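Proof. Since $f = \operatorname{Id} f \operatorname{Id}$, Theorem 2.3.8 gives
\[ f_{\mathcal{C}\to\mathcal{C}} = \operatorname{Id}_{\mathcal{B}\to\mathcal{C}}\, f_{\mathcal{B}\to\mathcal{B}}\, \operatorname{Id}_{\mathcal{C}\to\mathcal{B}} = (\operatorname{Id}_{\mathcal{C}\to\mathcal{B}})^{-1} f_{\mathcal{B}\to\mathcal{B}}\, \operatorname{Id}_{\mathcal{C}\to\mathcal{B}} = P^{-1} N P, \]
because, by Corollary 2.3.9, $P^{-1}$ is the matrix of the identity function $\operatorname{Id} : \mathcal{V} \to \mathcal{V}$ relative to the bases $\mathcal{B}$ and $\mathcal{C}$. □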
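To see the formula $M = P^{-1}NP$ at work, here is a small sketch with made-up data: $\mathcal{V} = \mathbb{K}^2$, $\mathcal{B}$ is the standard basis, and $\mathcal{C} = \{(1,1), (1,2)\}$, so the columns of $P$ are the vectors of $\mathcal{C}$. The matrix `N` and the helper names are assumptions chosen for illustration.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Matrix product A B for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(P):
    """Inverse of a 2 x 2 matrix, computed with exact fractions."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

N = [[2, 1],      # hypothetical matrix of f relative to the standard basis B
     [0, 3]]
P = [[1, 1],      # matrix of Id relative to C and B: columns are the vectors of C
     [1, 2]]

M = mat_mul(inverse_2x2(P), mat_mul(N, P))   # matrix of f relative to C
print(M)
```

The first column of `M` says that $f$ maps $(1,1)$ to $3\,(1,1)$, and the second that $f$ maps $(1,2)$ to $2\,(1,1) + 2\,(1,2) = (4,6)$, which agrees with applying `N` directly.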


2.3.2 The isomorphism between $\mathcal{M}_{n\times m}(\mathbb{K})$ and $\mathcal{L}(\mathcal{V}, \mathcal{W})$


The main result of this section is the fact that, if $\mathcal{V}$ and $\mathcal{W}$ are vector spaces such that $\dim \mathcal{V} = m$ and $\dim \mathcal{W} = n$, then the space of all linear transformations from $\mathcal{V}$ to $\mathcal{W}$ can be identified with the space of all $n \times m$ matrices. The following theorem formalizes this claim.


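Theorem 2.3.12. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ be bases of vector spaces $\mathcal{V}$ and $\mathcal{W}$, respectively. For every $n \times m$ matrix
\[ A = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \]
we define the linear transformation $f_A : \mathcal{V} \to \mathcal{W}$ via
\[ f_A(\mathbf{v}_j) = a_{1j}\mathbf{w}_1 + \dots + a_{nj}\mathbf{w}_n \]
for every $j \in \{1, \dots, m\}$. Then the function $\Lambda : \mathcal{M}_{n\times m}(\mathbb{K}) \to \mathcal{L}(\mathcal{V}, \mathcal{W})$ defined by
\[ \Lambda(A) = f_A \]
is an isomorphism.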


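Proof. We first show that $\Lambda$ is a linear transformation. Let
\[ A = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} b_{11} & \dots & b_{1m} \\ \vdots & & \vdots \\ b_{n1} & \dots & b_{nm} \end{bmatrix}. \]
Since
\[ \begin{aligned}
\Lambda(A + B)(\mathbf{v}_j) &= (a_{1j} + b_{1j})\mathbf{w}_1 + \dots + (a_{nj} + b_{nj})\mathbf{w}_n \\
&= a_{1j}\mathbf{w}_1 + \dots + a_{nj}\mathbf{w}_n + b_{1j}\mathbf{w}_1 + \dots + b_{nj}\mathbf{w}_n \\
&= \Lambda(A)(\mathbf{v}_j) + \Lambda(B)(\mathbf{v}_j),
\end{aligned} \]
we have $\Lambda(A + B) = \Lambda(A) + \Lambda(B)$. If $\alpha \in \mathbb{K}$, then
\[ \begin{aligned}
\Lambda(\alpha A)(\mathbf{v}_j) &= (\alpha a_{1j})\mathbf{w}_1 + \dots + (\alpha a_{nj})\mathbf{w}_n \\
&= \alpha(a_{1j}\mathbf{w}_1 + \dots + a_{nj}\mathbf{w}_n) \\
&= (\alpha\Lambda(A))(\mathbf{v}_j),
\end{aligned} \]
so $\Lambda(\alpha A) = \alpha\Lambda(A)$. Consequently $\Lambda$ is a linear transformation.
If $\Lambda(A) = \Lambda(B)$, then
\[ a_{1j}\mathbf{w}_1 + \dots + a_{nj}\mathbf{w}_n = b_{1j}\mathbf{w}_1 + \dots + b_{nj}\mathbf{w}_n \]
for all $1 \le j \le m$. Consequently $A = B$, proving that $\Lambda$ is injective.
Finally, if $g : \mathcal{V} \to \mathcal{W}$ is an arbitrary linear transformation and
\[ g(\mathbf{v}_j) = a_{1j}\mathbf{w}_1 + \dots + a_{nj}\mathbf{w}_n \]
for all $1 \le j \le m$, then $g = \Lambda(A)$ where
\[ A = \begin{bmatrix} a_{11} & \dots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \dots & a_{nm} \end{bmatrix}. \]
This shows that $\Lambda$ is surjective. □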


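Example 2.3.13. The function that assigns to the matrix
\[ A = \begin{bmatrix} a_1 & \dots & a_m \end{bmatrix} \]
the linear transformation $f_A : \mathbb{K}^m \to \mathbb{K}$ defined by
\[ f_A\left( \begin{bmatrix} x_1 \\ \vdots \\ x_m \end{bmatrix} \right) = a_1x_1 + \dots + a_mx_m \]
is an isomorphism from the vector space $\mathcal{M}_{1\times m}(\mathbb{K})$ to the vector space $\mathcal{L}(\mathbb{K}^m, \mathbb{K})$.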


2.4 Duality 


In this section we study the vector space £(V,K), that is the space of all linear 
transformations from a vector space to the number field K. While £(V,K) is 
a special case of the vector space of all linear transformations between vector 
spaces, it has some distinct properties. 




2.4.1 The dual space 


Definition 2.4.1. Let $\mathcal{V}$ be a vector space. A linear transformation $f : \mathcal{V} \to \mathbb{K}$ is called a functional or a linear form. The vector space $\mathcal{L}(\mathcal{V}, \mathbb{K})$ is called the dual space of the vector space $\mathcal{V}$ and is denoted by $\mathcal{V}'$.


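Example 2.4.2. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathbb{K}$ be a nonzero linear form. If $\mathbf{a} \in \mathcal{V}$ is such that $f(\mathbf{a}) \neq 0$, show that
\[ \mathcal{V} = \mathbb{K}\mathbf{a} \oplus \ker f \]
and determine the projection of $\mathcal{V}$ on $\mathbb{K}\mathbf{a}$ along $\ker f$.

Solution. For every $\mathbf{v} \in \mathcal{V}$ we have
\[ f\left( \mathbf{v} - \frac{f(\mathbf{v})}{f(\mathbf{a})}\,\mathbf{a} \right) = 0 \]
and thus $\mathbf{b} = \mathbf{v} - \frac{f(\mathbf{v})}{f(\mathbf{a})}\,\mathbf{a} \in \ker f$. Since we can write
\[ \mathbf{v} = \frac{f(\mathbf{v})}{f(\mathbf{a})}\,\mathbf{a} + \mathbf{b}, \]
we have
\[ \mathcal{V} = \mathbb{K}\mathbf{a} + \ker f. \]
Now we show that this sum is direct. If $\mathbf{v} \in \mathbb{K}\mathbf{a} \cap \ker f$, then $\mathbf{v} = \alpha\mathbf{a}$ for some $\alpha \in \mathbb{K}$ and $f(\mathbf{v}) = 0$. This yields $f(\alpha\mathbf{a}) = \alpha f(\mathbf{a}) = 0$. Since $f(\mathbf{a}) \neq 0$, we have $\alpha = 0$ and consequently $\mathbf{v} = \mathbf{0}$. Therefore
\[ \mathcal{V} = \mathbb{K}\mathbf{a} \oplus \ker f \]
and the projection of the vector $\mathbf{v} \in \mathcal{V}$ on $\mathbb{K}\mathbf{a}$ along $\ker f$ is
\[ \frac{f(\mathbf{v})}{f(\mathbf{a})}\,\mathbf{a}. \]
Note that, since $\dim \mathbb{K}\mathbf{a} = 1$ and $\dim \mathbb{K}\mathbf{a} + \dim \ker f = \dim \mathcal{V} = n$, we have $\dim \ker f = n - 1$, as shown in Corollary 2.1.31.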
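The decomposition in Example 2.4.2 is easy to try out on concrete numbers. In the sketch below the linear form, the vector `a`, and the vector `v` are all made-up choices on $\mathbb{K}^3$; the code splits `v` into its projection on $\mathbb{K}\mathbf{a}$ and a remainder `b`, which should lie in $\ker f$.

```python
from fractions import Fraction

def f(v):
    """Hypothetical linear form on K^3: f(x, y, z) = x + 2y - z."""
    x, y, z = v
    return x + 2 * y - z

a = [1, 1, 1]   # any vector with f(a) != 0; here f(a) = 2
v = [3, 1, 4]   # the vector to decompose; here f(v) = 1

coeff = Fraction(f(v), f(a))                 # f(v) / f(a)
proj = [coeff * ai for ai in a]              # projection of v on Ka along ker f
b = [vi - pi for vi, pi in zip(v, proj)]     # remaining component

print(proj, b, f(b))
```

The remainder satisfies $f(\mathbf{b}) = 0$ and `proj` plus `b` gives back `v`, which is the decomposition $\mathbf{v} = \frac{f(\mathbf{v})}{f(\mathbf{a})}\mathbf{a} + \mathbf{b}$.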


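Example 2.4.3. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f, g \in \mathcal{V}'$. If $f$ is nonzero and $\ker f \subseteq \ker g$, show that $g \in \operatorname{Span}\{f\}$.

Solution. Let $\mathbf{v} \in \ker f$. With the notation from Example 2.4.2, we have
\[ g(\alpha\mathbf{a} + \mathbf{v}) = \alpha g(\mathbf{a}) = \frac{g(\mathbf{a})}{f(\mathbf{a})}\,\alpha f(\mathbf{a}) = \frac{g(\mathbf{a})}{f(\mathbf{a})}\, f(\alpha\mathbf{a} + \mathbf{v}). \]
Thus $g = \frac{g(\mathbf{a})}{f(\mathbf{a})}\, f$ because $\mathcal{V} = \mathbb{K}\mathbf{a} + \ker f$. □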


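Definition 2.4.4. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis of a vector space $\mathcal{V}$. For every $j \in \{1, \dots, n\}$ by $l_{\mathbf{v}_j}$ we mean the unique linear form $l_{\mathbf{v}_j} : \mathcal{V} \to \mathbb{K}$ such that $l_{\mathbf{v}_j}(\mathbf{v}_j) = 1$ and $l_{\mathbf{v}_j}(\mathbf{v}_k) = 0$ for every $k \neq j$. In other words,
\[ l_{\mathbf{v}_j}(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = x_j. \]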


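Theorem 2.4.5. If $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a basis of $\mathcal{V}$, then $\{l_{\mathbf{v}_1}, \dots, l_{\mathbf{v}_n}\}$ is a basis of $\mathcal{V}'$.

Proof. Assume $x_1 l_{\mathbf{v}_1} + \dots + x_n l_{\mathbf{v}_n} = 0$. Since, for every $1 \le k \le n$, we have
\[ 0 = x_1 l_{\mathbf{v}_1}(\mathbf{v}_k) + \dots + x_n l_{\mathbf{v}_n}(\mathbf{v}_k) = x_k l_{\mathbf{v}_k}(\mathbf{v}_k) = x_k, \]
the functions $l_{\mathbf{v}_1}, \dots, l_{\mathbf{v}_n}$ are linearly independent.

If $f : \mathcal{V} \to \mathbb{K}$ is the linear transformation such that $f(\mathbf{v}_j) = a_j$ for every $1 \le j \le n$, then it is easy to verify that
\[ f = a_1 l_{\mathbf{v}_1} + \dots + a_n l_{\mathbf{v}_n}. \]
This shows that
\[ \operatorname{Span}\{l_{\mathbf{v}_j},\ 1 \le j \le n\} = \mathcal{V}', \]
completing the proof. □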


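Definition 2.4.6. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis of the vector space $\mathcal{V}$. The basis $\{l_{\mathbf{v}_1}, \dots, l_{\mathbf{v}_n}\}$ of $\mathcal{V}'$ is called the dual basis of the basis $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$.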


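Example 2.4.7. Find the dual basis of the basis $\{1, t, \dots, t^n\}$ in the space $\mathcal{P}_n(\mathbb{K})$.

Solution. According to the definition of the dual basis we have
\[ \begin{aligned}
l_1(a_0 + a_1t + \dots + a_nt^n) &= a_0, \\
l_t(a_0 + a_1t + \dots + a_nt^n) &= a_1, \\
&\;\;\vdots \\
l_{t^n}(a_0 + a_1t + \dots + a_nt^n) &= a_n.
\end{aligned} \]
Note that for any $p \in \mathcal{P}_n(\mathbb{K})$ we could write
\[ l_1(p) = p(0), \quad l_t(p) = p'(0), \quad l_{t^2}(p) = \tfrac{1}{2}p''(0), \quad \dots, \quad l_{t^n}(p) = \tfrac{1}{n!}p^{(n)}(0). \]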


This formulation has the advantage that we don’t have to write p in the form 
ag +ait+-+++a,t". For example, if p(t) = ((#? +¢+1)?+¢+3)’, it would 
be quite time consuming to calculate $l_t(p)$ using the first formula. Calculating it using $l_t(p) = p'(0)$ is much simpler.
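The identity $l_{t^k}(p) = \frac{1}{k!}p^{(k)}(0)$ can be checked mechanically. The sketch below is an illustration with a polynomial of our own choosing, $p(t) = (t^2 + t + 1)^3$, stored as a list of coefficients; the helper names are hypothetical. It compares the values obtained from derivatives at $0$ with the coefficients of the expanded polynomial.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [a0, a1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative(p):
    """Differentiate a polynomial given by its coefficients."""
    return [k * p[k] for k in range(1, len(p))]

def dual_functional(k, p):
    """Value of l_{t^k}(p), computed as p^(k)(0) / k!."""
    for _ in range(k):
        p = derivative(p)
    value_at_zero = p[0] if p else 0
    return Fraction(value_at_zero, factorial(k))

q = [1, 1, 1]                         # t^2 + t + 1
p = poly_mul(poly_mul(q, q), q)       # hypothetical example: (t^2 + t + 1)^3

values = [dual_functional(k, p) for k in range(len(p))]
print(p, values)
```

The list `values` coincides with the coefficient list of `p`, so each functional of the dual basis picks out exactly one coefficient.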


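Theorem 2.4.8. Let $\mathcal{V}$ be a vector space such that $\dim \mathcal{V} = n$. If $\mathbf{v}_1, \dots, \mathbf{v}_j$ are linearly independent vectors in $\mathcal{V}$, then the set
\[ \mathcal{Q} = \{f \in \mathcal{V}' : f(\mathbf{v}_1) = \dots = f(\mathbf{v}_j) = 0\} \]
is a vector subspace of $\mathcal{V}'$ and $\dim \mathcal{Q} = n - j$.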


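Proof. First we extend $\{\mathbf{v}_1, \dots, \mathbf{v}_j\}$ to a basis $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ of $\mathcal{V}$. Let $\{l_{\mathbf{v}_1}, \dots, l_{\mathbf{v}_n}\}$ be its dual basis. It is easy to see that $\mathcal{Q}$ is a subspace of $\mathcal{V}'$.

Now we show that $\{l_{\mathbf{v}_{j+1}}, \dots, l_{\mathbf{v}_n}\}$ is a basis of $\mathcal{Q}$. The linear functionals $l_{\mathbf{v}_{j+1}}, \dots, l_{\mathbf{v}_n}$ are in $\mathcal{Q}$ and are linearly independent, so we only have to show that $\operatorname{Span}\{l_{\mathbf{v}_{j+1}}, \dots, l_{\mathbf{v}_n}\} = \mathcal{Q}$. If $f \in \mathcal{Q}$, then we can write
\[ f = x_1 l_{\mathbf{v}_1} + \dots + x_n l_{\mathbf{v}_n}, \]
where $x_1, \dots, x_n$ are numbers from $\mathbb{K}$. Since, for every $1 \le k \le j$, we have $0 = f(\mathbf{v}_k) = x_k$, we conclude that
\[ f = x_{j+1} l_{\mathbf{v}_{j+1}} + \dots + x_n l_{\mathbf{v}_n}. \]
□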


2.4.2 The bidual 


For any vector space V the dual space V’ is a vector space, so it makes sense to 
consider its dual, that is, (V’)’. 




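Definition 2.4.9. Let $\mathcal{V}$ be a vector space. The vector space $(\mathcal{V}')'$ is called the bidual of $\mathcal{V}$.


Example 2.4.10. Let $\mathcal{V}$ be a vector space and let $\mathbf{v} \in \mathcal{V}$. It is easy to verify that the function $\varphi_{\mathbf{v}} : \mathcal{V}' \to \mathbb{K}$ defined by $\varphi_{\mathbf{v}}(l) = l(\mathbf{v})$ is an element of the bidual, that is, a linear form on $\mathcal{V}'$.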


In this section we will show that, if V is finite dimensional, then the spaces 
V and (V’)’ are isomorphic (see Theorem 2.4.12). First we need to prove an 
auxiliary result. 


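Lemma 2.4.11. Let $\mathcal{V}$ be a vector space of finite dimension and let $\mathbf{v} \in \mathcal{V}$. If $l(\mathbf{v}) = 0$ for every $l \in \mathcal{V}'$, then $\mathbf{v} = \mathbf{0}$.


Proof. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis of $\mathcal{V}$ and let $\{l_{\mathbf{v}_1}, \dots, l_{\mathbf{v}_n}\}$ be the dual basis. If $\mathbf{v} \in \mathcal{V}$, then $\mathbf{v} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n$ for some $x_1, \dots, x_n \in \mathbb{K}$. For every $1 \le j \le n$ we have $0 = l_{\mathbf{v}_j}(\mathbf{v}) = x_j$, which means that $\mathbf{v} = \mathbf{0}$. □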


Now we prove the main result of this section. 


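Theorem 2.4.12. Let $\mathcal{V}$ be a finite dimensional vector space. The function $\Gamma : \mathcal{V} \to (\mathcal{V}')'$, which associates with every vector $\mathbf{v} \in \mathcal{V}$ the linear form $\varphi_{\mathbf{v}} : \mathcal{V}' \to \mathbb{K}$ defined by $\varphi_{\mathbf{v}}(l) = l(\mathbf{v})$, is an isomorphism from $\mathcal{V}$ to $(\mathcal{V}')'$.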


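Proof. For any $\mathbf{v}, \mathbf{v}_1, \mathbf{v}_2 \in \mathcal{V}$, $\alpha \in \mathbb{K}$, and $l \in \mathcal{V}'$, we have
\[ \varphi_{\mathbf{v}_1 + \mathbf{v}_2}(l) = l(\mathbf{v}_1 + \mathbf{v}_2) = l(\mathbf{v}_1) + l(\mathbf{v}_2) = \varphi_{\mathbf{v}_1}(l) + \varphi_{\mathbf{v}_2}(l) \]
and
\[ \varphi_{\alpha\mathbf{v}}(l) = l(\alpha\mathbf{v}) = \alpha l(\mathbf{v}) = \alpha\varphi_{\mathbf{v}}(l). \]
This shows that $\Gamma$ is a linear transformation. If $\varphi_{\mathbf{v}}(l) = l(\mathbf{v}) = 0$ for every $l \in \mathcal{V}'$, then $\mathbf{v} = \mathbf{0}$ by Lemma 2.4.11. Consequently, $\Gamma$ is injective and thus $\dim \operatorname{ran} \Gamma = \dim \mathcal{V}$.

Finally, since $\dim \mathcal{V} = \dim \mathcal{V}' = \dim (\mathcal{V}')'$, we have $\operatorname{ran} \Gamma = (\mathcal{V}')'$ because $\operatorname{ran} \Gamma$ is a subspace of $(\mathcal{V}')'$ such that $\dim \operatorname{ran} \Gamma = \dim (\mathcal{V}')'$. □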




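The function $\Gamma : \mathcal{V} \to (\mathcal{V}')'$ that assigns to $\mathbf{v}$ the linear form $\varphi_{\mathbf{v}} : \mathcal{V}' \to \mathbb{K}$ defined by $\varphi_{\mathbf{v}}(l) = l(\mathbf{v})$ is called the canonical isomorphism from $\mathcal{V}$ to $(\mathcal{V}')'$.
From Theorem 2.4.12 it follows that every basis of $\mathcal{V}'$ is the dual basis of some basis of $\mathcal{V}$.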


Theorem 2.4.13. Let $\mathcal{V}$ be a vector space and let $\{f_1, \dots, f_n\}$ be a basis of $\mathcal{V}'$. Then there is a basis $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ of $\mathcal{V}$ such that $\{f_1, \dots, f_n\}$ is its dual basis.


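Proof. Let $\mathcal{V}$ be an $n$-dimensional vector space. Let $\{f_1, \dots, f_n\}$ be a basis of $\mathcal{V}'$ and let $\{l_{f_1}, \dots, l_{f_n}\}$ be its dual basis in $(\mathcal{V}')'$. By Theorem 2.4.12, there exist vectors $\mathbf{v}_1, \dots, \mathbf{v}_n \in \mathcal{V}$ such that $\Gamma(\mathbf{v}_j) = l_{f_j}$ for $j \in \{1, \dots, n\}$. Since $\{l_{f_1}, \dots, l_{f_n}\}$ is a basis of $(\mathcal{V}')'$ and $\Gamma$ is an isomorphism, $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ is a basis of $\mathcal{V}$. The set $\{f_1, \dots, f_n\}$ is the dual basis of $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ because for every $j \in \{1, \dots, n\}$ we have
\[ f_j(\mathbf{v}_j) = \Gamma(\mathbf{v}_j)(f_j) = l_{f_j}(f_j) = 1 \]
and
\[ f_j(\mathbf{v}_k) = \Gamma(\mathbf{v}_k)(f_j) = l_{f_k}(f_j) = 0 \]
whenever $j \neq k$. □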
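In coordinates, Theorem 2.4.13 amounts to inverting a matrix: if the linear forms $f_1, \dots, f_n$ on $\mathbb{K}^n$ are written as the rows of a matrix $F$, then the columns of $F^{-1}$ are vectors $\mathbf{v}_j$ with $f_i(\mathbf{v}_j) = 1$ for $i = j$ and $0$ otherwise. The sketch below illustrates this for two made-up linear forms on $\mathbb{K}^2$; it is not taken from the text and the helper names are ours.

```python
from fractions import Fraction

def inverse_2x2(P):
    """Inverse of a 2 x 2 matrix, computed with exact fractions."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def apply_form(row, v):
    """Evaluate the linear form with coefficient row `row` at the vector v."""
    return sum(c * x for c, x in zip(row, v))

# Hypothetical basis of the dual space: f1(x, y) = x + y, f2(x, y) = x + 2y.
F = [[1, 1],
     [1, 2]]

V = inverse_2x2(F)
v1 = [V[0][0], V[1][0]]   # first column of the inverse
v2 = [V[0][1], V[1][1]]   # second column of the inverse

# table[i][j] = f_{i+1}(v_{j+1}); it should be the identity matrix.
table = [[apply_form(F[i], v) for v in (v1, v2)] for i in range(2)]
print(v1, v2, table)
```

Here $\{f_1, f_2\}$ is the dual basis of $\{\mathbf{v}_1, \mathbf{v}_2\} = \{(2,-1), (-1,1)\}$.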


Note the similarity between the next theorem and Theorem 2.4.8. We could 
say that Theorem 2.4.14 is a “dual version” of Theorem 2.4.8. The proofs of 
these two theorems are also similar. 


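Theorem 2.4.14. Let $\mathcal{V}$ be a vector space such that $\dim \mathcal{V} = n$. If $f_1, \dots, f_j \in \mathcal{V}'$ are linearly independent, then the set
\[ \mathcal{U} = \{\mathbf{v} \in \mathcal{V} : f_1(\mathbf{v}) = \dots = f_j(\mathbf{v}) = 0\} \]
is a vector subspace of $\mathcal{V}$ and $\dim \mathcal{U} = n - j$.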


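Proof. First we extend $\{f_1, \dots, f_j\}$ to a basis $\{f_1, \dots, f_n\}$ of $\mathcal{V}'$. Let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis of $\mathcal{V}$ such that $\{f_1, \dots, f_n\}$ is its dual basis. It is easy to see that $\mathcal{U}$ is a subspace of $\mathcal{V}$.

Now we show that $\{\mathbf{v}_{j+1}, \dots, \mathbf{v}_n\}$ is a basis of $\mathcal{U}$. The vectors $\mathbf{v}_{j+1}, \dots, \mathbf{v}_n$ are in $\mathcal{U}$ and are linearly independent, so we only have to show that $\operatorname{Span}\{\mathbf{v}_{j+1}, \dots, \mathbf{v}_n\} = \mathcal{U}$. If $\mathbf{u} \in \mathcal{U}$, then
\[ \mathbf{u} = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n, \]
where $x_1, \dots, x_n$ are numbers from $\mathbb{K}$.
Since, for every $1 \le k \le j$, we have $0 = f_k(\mathbf{u}) = x_k$, we conclude that
\[ \mathbf{u} = x_{j+1}\mathbf{v}_{j+1} + \dots + x_n\mathbf{v}_n. \]
□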


2.5 Quotient spaces 


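For a vector space $\mathcal{V}$ and its subspace $\mathcal{U}$ there is a subspace $\mathcal{W}$ such that $\mathcal{V} = \mathcal{U} \oplus \mathcal{W}$. While the space $\mathcal{W}$ is not unique, we can show that, if $\mathcal{U} \oplus \mathcal{W}_1 = \mathcal{U} \oplus \mathcal{W}_2$, then the spaces $\mathcal{W}_1$ and $\mathcal{W}_2$ are isomorphic. In this section we present a canonical way of constructing, for a given vector space $\mathcal{V}$ and its subspace $\mathcal{U}$, a space that is isomorphic to every space $\mathcal{W}$ such that $\mathcal{V} = \mathcal{U} \oplus \mathcal{W}$.

If $\mathcal{U}$ is a subspace of a vector space $\mathcal{V}$ and $\mathbf{x} \in \mathcal{V}$, then we define
\[ \mathbf{x} + \mathcal{U} = \{\mathbf{x} + \mathbf{u} : \mathbf{u} \in \mathcal{U}\}. \]
In this section we will use the following notation
\[ \hat{\mathbf{x}} = \mathbf{x} + \mathcal{U}, \]
which is a generalization of what was introduced in Example 1.1.8. This notation makes sense only if it is clear what the subspace $\mathcal{U}$ is. Note that, while $\mathbf{x}$ is a vector, $\hat{\mathbf{x}}$ is a set of vectors. In particular, we have $\mathbf{x} = \mathbf{x} + \mathbf{0} \in \hat{\mathbf{x}}$.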


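Lemma 2.5.1. Let $\mathbf{x}$ and $\mathbf{y}$ be vectors in a vector space $\mathcal{V}$ and let $\mathcal{U}$ be a subspace of $\mathcal{V}$. Then $\hat{\mathbf{x}} = \hat{\mathbf{y}}$ if and only if $\mathbf{y} = \mathbf{x} + \mathbf{u}$ for some $\mathbf{u} \in \mathcal{U}$.


Proof. First assume that $\hat{\mathbf{y}} = \hat{\mathbf{x}}$. Then
\[ \mathbf{y} \in \hat{\mathbf{y}} = \hat{\mathbf{x}} = \mathbf{x} + \mathcal{U}. \]
Consequently, $\mathbf{y} = \mathbf{x} + \mathbf{u}$ for some $\mathbf{u} \in \mathcal{U}$.
Now assume that $\mathbf{y} = \mathbf{x} + \mathbf{u}$ for some $\mathbf{u} \in \mathcal{U}$. Then
\[ \hat{\mathbf{y}} = \mathbf{y} + \mathcal{U} = (\mathbf{x} + \mathbf{u}) + \mathcal{U} = \mathbf{x} + (\mathbf{u} + \mathcal{U}) = \mathbf{x} + \mathcal{U} = \hat{\mathbf{x}}. \]
□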


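Corollary 2.5.2. Let $\mathcal{V}$ be a vector space and let $\mathcal{U}$ be a subspace of $\mathcal{V}$. If $\mathbf{x} \in \mathcal{V}$ and $\mathbf{u} \in \mathcal{U}$, then
\[ \hat{\mathbf{x}} = \widehat{\mathbf{x} + \mathbf{u}}. \]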


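Theorem 2.5.3. Let $\mathcal{V}$ be a vector space and let $\mathcal{U}$ be a subspace of $\mathcal{V}$. If $\hat{\mathbf{x}} \cap \hat{\mathbf{y}} \neq \emptyset$, then
\[ \hat{\mathbf{x}} = \hat{\mathbf{y}}. \]


Proof. If $\hat{\mathbf{x}} \cap \hat{\mathbf{y}} \neq \emptyset$, then there are vectors $\mathbf{u}_1, \mathbf{u}_2 \in \mathcal{U}$ such that $\mathbf{x} + \mathbf{u}_1 = \mathbf{y} + \mathbf{u}_2$ and, by Corollary 2.5.2, we have
\[ \hat{\mathbf{x}} = \widehat{\mathbf{x} + \mathbf{u}_1} = \widehat{\mathbf{y} + \mathbf{u}_2} = \hat{\mathbf{y}}. \]
□

Note that the above implies that $\hat{\mathbf{x}} = \hat{\mathbf{0}}$ if and only if $\mathbf{x} \in \mathcal{U}$.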


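Definition 2.5.4. Let $\mathcal{V}$ be a vector space and let $\mathcal{U}$ be a subspace of $\mathcal{V}$. The set
\[ \mathcal{V}/\mathcal{U} = \{\mathbf{x} + \mathcal{U} : \mathbf{x} \in \mathcal{V}\} \]
is called the quotient space of $\mathcal{V}$ by $\mathcal{U}$.


In other words, but less precisely, $\mathcal{V}/\mathcal{U}$ is the set of all $\hat{\mathbf{x}}$'s with $\mathbf{x} \in \mathcal{V}$. The quotient space $\mathcal{V}/\mathcal{U}$ becomes a vector space if we define
\[ \hat{\mathbf{x}} + \hat{\mathbf{y}} = \widehat{\mathbf{x} + \mathbf{y}} \quad\text{and}\quad \alpha\hat{\mathbf{x}} = \widehat{\alpha\mathbf{x}} \]
for any $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ and $\alpha \in \mathbb{K}$. It is easy to verify that these operations are well-defined. Note that $\hat{\mathbf{0}} = \mathcal{U}$ is the zero vector in $\mathcal{V}/\mathcal{U}$. It is important that the operations in $\mathcal{V}/\mathcal{U}$ are defined in such a way that the function $q : \mathcal{V} \to \mathcal{V}/\mathcal{U}$ defined by $q(\mathbf{x}) = \hat{\mathbf{x}}$ is a linear transformation.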


Definition 2.5.5. Let $\mathcal{U}$ be a subspace of a vector space $\mathcal{V}$. The function $q : \mathcal{V} \to \mathcal{V}/\mathcal{U}$ defined by
\[ q(\mathbf{x}) = \hat{\mathbf{x}} \]
is called the quotient linear transformation.


The following example will be generalized in Exercise 2.75. 


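Example 2.5.6. Show that $\dim \mathcal{V}/\mathcal{U} = 1$ if and only if $\mathcal{U} \oplus \mathbb{K}\mathbf{v} = \mathcal{V}$ for some $\mathbf{v} \in \mathcal{V}$.


Solution. Let $q : \mathcal{V} \to \mathcal{V}/\mathcal{U}$ be the quotient linear transformation and let $\{q(\mathbf{v})\} = \{\hat{\mathbf{v}}\}$ be a basis for $\mathcal{V}/\mathcal{U}$. Note that because $\hat{\mathbf{v}} \neq \hat{\mathbf{0}} = \mathcal{U}$ the vector $\mathbf{v}$ is not in $\mathcal{U}$. Then for every $\mathbf{x} \in \mathcal{V}$ there is $\alpha \in \mathbb{K}$ such that $\hat{\mathbf{x}} = \alpha\hat{\mathbf{v}} = \widehat{\alpha\mathbf{v}}$, and thus
\[ \mathbf{x} = \alpha\mathbf{v} + \mathbf{u}, \]
for some $\mathbf{u} \in \mathcal{U}$. Consequently, $\mathcal{V} = \mathcal{U} + \mathbb{K}\mathbf{v}$ and the sum $\mathcal{U} + \mathbb{K}\mathbf{v}$ is direct, because $\mathbf{v} \notin \mathcal{U}$.

Conversely, if $\mathcal{U} \oplus \mathbb{K}\mathbf{v} = \mathcal{V}$, then every vector $\mathbf{x} \in \mathcal{V}$ can be written as $\mathbf{x} = \mathbf{u} + \alpha\mathbf{v}$ for some $\mathbf{u} \in \mathcal{U}$ and $\alpha \in \mathbb{K}$. Consequently, $\hat{\mathbf{x}} = \widehat{\alpha\mathbf{v}} = \alpha\hat{\mathbf{v}}$ and thus $\{\hat{\mathbf{v}}\}$ is a basis for $\mathcal{V}/\mathcal{U}$. □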


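Example 2.5.7. Let $\mathbf{v}, \mathbf{u}, \mathbf{w}$ be linearly independent vectors in $\mathbb{R}^3$. The set $\{\mathbf{u} + \mathbb{R}\mathbf{v} + \mathbb{R}\mathbf{w}\}$ is a basis of $\mathbb{R}^3/(\mathbb{R}\mathbf{v} + \mathbb{R}\mathbf{w})$.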


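Theorem 2.5.8. Let $\mathcal{U}$ be a subspace of a finite dimensional vector space $\mathcal{V}$. Then
\[ \dim \mathcal{V} = \dim \mathcal{U} + \dim \mathcal{V}/\mathcal{U}. \]


Proof. This result can be obtained from the Rank-Nullity Theorem 2.1.28. Indeed, $\mathcal{U} = \ker q$ and $\mathcal{V}/\mathcal{U} = \operatorname{ran} q$ where $q : \mathcal{V} \to \mathcal{V}/\mathcal{U}$ is the quotient linear transformation (see Definition 2.5.5). □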


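Theorem 2.5.9. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces and let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. There is an isomorphism $g : \mathcal{V}/\ker f \to \operatorname{ran} f$ such that
\[ f(\mathbf{x}) = g(q(\mathbf{x})) = g(\hat{\mathbf{x}}), \]
where $\mathbf{x}$ is a vector from $\mathcal{V}$ and $q : \mathcal{V} \to \mathcal{V}/\ker f$ is the quotient linear transformation.


Proof. For $\mathbf{x} \in \mathcal{V}$ we define $g(\hat{\mathbf{x}}) = f(\mathbf{x})$. Note that $g$ is a well-defined function $g : \mathcal{V}/\ker f \to \operatorname{ran} f$. Indeed, if $\hat{\mathbf{x}} = \widehat{\mathbf{x} + \mathbf{y}}$ for some $\mathbf{y} \in \ker f$, then we have
\[ g(\widehat{\mathbf{x} + \mathbf{y}}) = f(\mathbf{x} + \mathbf{y}) = f(\mathbf{x}) \]
because $f(\mathbf{y}) = \mathbf{0}$.
Since for every $\mathbf{x}_1, \mathbf{x}_2 \in \mathcal{V}$ we have
\[ g(\hat{\mathbf{x}}_1 + \hat{\mathbf{x}}_2) = g(\widehat{\mathbf{x}_1 + \mathbf{x}_2}) = f(\mathbf{x}_1 + \mathbf{x}_2) = f(\mathbf{x}_1) + f(\mathbf{x}_2) = g(\hat{\mathbf{x}}_1) + g(\hat{\mathbf{x}}_2) \]
and for every $\mathbf{x} \in \mathcal{V}$ and $\alpha \in \mathbb{K}$ we have
\[ g(\alpha\hat{\mathbf{x}}) = g(\widehat{\alpha\mathbf{x}}) = f(\alpha\mathbf{x}) = \alpha f(\mathbf{x}) = \alpha g(\hat{\mathbf{x}}), \]
$g$ is a linear transformation.

If $g(\hat{\mathbf{x}}) = f(\mathbf{x}) = \mathbf{0}$, then $\mathbf{x} \in \ker f$ and thus $\hat{\mathbf{x}} = \hat{\mathbf{0}}$. Consequently, $g$ is injective.

Finally, if $\mathbf{y} \in \operatorname{ran} f$, then there is $\mathbf{x} \in \mathcal{V}$ such that $f(\mathbf{x}) = \mathbf{y}$. Consequently, $g(\hat{\mathbf{x}}) = \mathbf{y}$ and thus $g$ is surjective. □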
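The key point of the proof is that $g(\hat{\mathbf{x}}) = f(\mathbf{x})$ does not depend on the representative $\mathbf{x}$ of the coset. A minimal sketch, assuming a made-up linear form on $\mathbb{R}^3$ and representing a coset by any of its representatives, illustrates both this and injectivity; the function names are hypothetical.

```python
def f(x):
    """Hypothetical linear form on R^3: f(x1, x2, x3) = x1 + x2 + x3."""
    return sum(x)

def same_coset(x, y):
    """x + ker f equals y + ker f exactly when y - x lies in ker f (Lemma 2.5.1)."""
    return f([yi - xi for xi, yi in zip(x, y)]) == 0

def g(x):
    """Induced map g(x + ker f) = f(x), evaluated on a representative x."""
    return f(x)

x = [1, 2, 3]
y = [2, 1, 3]   # y = x + (1, -1, 0), and (1, -1, 0) is in ker f
z = [0, 0, 1]   # a representative of a different coset

print(same_coset(x, y), g(x), g(y))   # same coset, same value of g
print(same_coset(x, z), g(x), g(z))   # different cosets, different values
```

Representatives of the same coset give the same value of `g`, while the two different cosets are sent to different values.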


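Corollary 2.5.10. If $\mathcal{U}$ and $\mathcal{W}$ are subspaces of a vector space $\mathcal{V}$ such that $\mathcal{V} = \mathcal{U} \oplus \mathcal{W}$, then the spaces $\mathcal{V}/\mathcal{U}$ and $\mathcal{W}$ are isomorphic.


Proof. If $f : \mathcal{U} \oplus \mathcal{W} \to \mathcal{W}$ is the projection on $\mathcal{W}$ along $\mathcal{U}$, then $\ker f = \mathcal{U}$ and $\operatorname{ran} f = \mathcal{W}$. Therefore the result is a consequence of Theorem 2.5.9. □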


Note that if the vector space V is finite dimensional we can get Theorem 
2.5.8 as a consequence of Corollary 2.5.10. 


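Example 2.5.11. Let $\mathbb{R}^3 = \mathbb{R}\mathbf{u} \oplus \mathbb{R}\mathbf{v} \oplus \mathbb{R}\mathbf{w}$. If $f : \mathbb{R}^3 \to \mathbb{R}\mathbf{v} \oplus \mathbb{R}\mathbf{w}$ is the projection on $\mathbb{R}\mathbf{v} \oplus \mathbb{R}\mathbf{w}$ along $\mathbb{R}\mathbf{u}$, then the function $g : \mathbb{R}^3/\mathbb{R}\mathbf{u} \to \mathbb{R}\mathbf{v} \oplus \mathbb{R}\mathbf{w}$ defined by
\[ g(q(\alpha\mathbf{v} + \beta\mathbf{w} + \gamma\mathbf{u})) = \alpha\mathbf{v} + \beta\mathbf{w}, \]
where $q : \mathbb{R}^3 \to \mathbb{R}^3/\mathbb{R}\mathbf{u}$ is the quotient linear transformation, is an isomorphism.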


2.6 Exercises 


2.6.1 Basic properties 


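Exercise 2.1. Let $\mathcal{V}$ be a vector space and let $\mathcal{U}$ be a subspace of $\mathcal{V}$. If $f : \mathcal{V} \to \mathcal{V}$ and $f(\mathcal{U}) \subseteq \mathcal{U}$, then we say that $\mathcal{U}$ is $f$-invariant.

Let $f, g : \mathcal{V} \to \mathcal{V}$ be linear transformations. If $\mathcal{U}$ is an $f$-invariant and a $g$-invariant subspace, show that $\mathcal{U}$ is a $(gf)$-invariant subspace.


Exercise 2.2. Let $\mathcal{V}$ be a vector space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear transformation. If $\mathcal{U}$ is an $f$-invariant subspace (see Exercise 2.1), show that the restriction of $f$ to $\mathcal{U}$ is a linear transformation $f_{\mathcal{U}} : \mathcal{U} \to \mathcal{U}$.


Exercise 2.3. Let $\mathcal{V}$ be a vector space and $f : \mathcal{V} \to \mathcal{V}$ be a linear transformation. We suppose that $\mathcal{U}$ and $\mathcal{W}$ are $f$-invariant subspaces (see Exercise 2.1). Show that $\mathcal{U} \cap \mathcal{W}$ is an $f$-invariant subspace.


Exercise 2.4. Let $\mathcal{V}$ be a vector space and $f : \mathcal{V} \to \mathcal{V}$ be a linear transformation. We suppose that $\mathcal{U}$ and $\mathcal{W}$ are $f$-invariant subspaces (see Exercise 2.1). Show that $\mathcal{U} + \mathcal{W}$ is an $f$-invariant subspace.


Exercise 2.5. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces, let $\mathcal{U}$ be a subspace of $\mathcal{V}$, and let $g : \mathcal{U} \to \mathcal{W}$ be a linear transformation. If $\mathcal{V}$ is a finite dimensional space, show that there is a linear transformation $f : \mathcal{V} \to \mathcal{W}$ such that the restriction of $f$ to $\mathcal{U}$ is $g$.


Exercise 2.6. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation and let $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ be a basis of $\operatorname{ran} f$. If $\mathbf{w}_j = f(\mathbf{u}_j)$ for every $j \in \{1, \dots, n\}$ and some $\mathbf{u}_1, \dots, \mathbf{u}_n \in \mathcal{V}$, then $\mathcal{V} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_n\} \oplus \ker f$.


Exercise 2.7. If $f : \mathcal{V} \to \mathbb{K}$ is a nonzero linear transformation, then there is a vector $\mathbf{u} \in \mathcal{V}$ such that $f(\mathbf{u}) = 1$ and $\mathcal{V} = \operatorname{Span}\{\mathbf{u}\} \oplus \ker f$. Determine the projection on $\ker f$ along $\operatorname{Span}\{\mathbf{u}\}$.


Exercise 2.8. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces. For arbitrary $\mathbf{w} \in \mathcal{W}$ and a linear transformation $f : \mathcal{V} \to \mathbb{K}$ we define a function $\mathbf{w} \otimes f : \mathcal{V} \to \mathcal{W}$ by $(\mathbf{w} \otimes f)(\mathbf{v}) = f(\mathbf{v})\mathbf{w}$. Show that, if $g : \mathcal{V} \to \mathcal{W}$ is a linear transformation such that $\dim \operatorname{ran} g = 1$, then there exist $\mathbf{w} \in \mathcal{W}$ and a linear transformation $f : \mathcal{V} \to \mathbb{K}$ such that $g = \mathbf{w} \otimes f$.


Exercise 2.9. Let $\mathcal{V}$ be a vector space and let $\mathbf{v}_1, \dots, \mathbf{v}_n \in \mathcal{V}$. We define a function $f : \mathbb{K}^n \to \mathcal{V}$ by
\[ f\left( \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \right) = x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n. \]
Show that $\ker f \neq \{\mathbf{0}\}$ if and only if the vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ are linearly dependent.


Exercise 2.10. Let $\mathcal{V}$ be a vector space and let $\mathbf{v}$ be a nonzero vector in $\mathcal{V}$. If there is a vector subspace $\mathcal{U} \subseteq \mathcal{V}$ such that $\mathcal{U} \oplus \mathbb{K}\mathbf{v} = \mathcal{V}$, show that there is a linear transformation $f : \mathcal{V} \to \mathbb{K}$ such that $\ker f = \mathcal{U}$.


Exercise 2.11. Let $\mathcal{V}$ be a vector space and $f : \mathcal{V} \to \mathbb{K}$ be a nonzero linear transformation. Show that there is a nonzero vector $\mathbf{v} \in \mathcal{V}$ such that $\ker f \oplus \mathbb{K}\mathbf{v} = \mathcal{V}$.


Exercise 2.12. Let $\mathcal{V}$ be a vector space and let $f, g : \mathcal{V} \to \mathbb{K}$ be nonzero linear transformations such that $\ker f = \ker g$. Show that there is a nonzero number $\alpha \in \mathbb{K}$ such that $g = \alpha f$.


Exercise 2.13. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces and let $\mathcal{U}_1, \dots, \mathcal{U}_n$ be subspaces of $\mathcal{V}$ such that $\mathcal{V} = \mathcal{U}_1 \oplus \dots \oplus \mathcal{U}_n$. If $f_1 : \mathcal{U}_1 \to \mathcal{W}, \dots, f_n : \mathcal{U}_n \to \mathcal{W}$ are linear transformations, show that there is a unique linear transformation $g : \mathcal{V} \to \mathcal{W}$ such that $g(\mathbf{u}_j) = f_j(\mathbf{u}_j)$ for every $j \in \{1, \dots, n\}$ and $\mathbf{u}_j \in \mathcal{U}_j$.


Exercise 2.14. Let $\mathcal{V}$, $\mathcal{W}$, and $\mathcal{U}$ be vector spaces and let $\mathcal{V}_1 \subseteq \mathcal{V}$ and $\mathcal{W}_1 \subseteq \mathcal{W}$ be subspaces. If $\mathcal{V} = \mathcal{U} \oplus \mathcal{V}_1$ and $\dim \mathcal{V}_1 = \dim \mathcal{W}_1 = n$ for some $n \ge 1$, show that there is a linear transformation $f \in \mathcal{L}(\mathcal{V}, \mathcal{W})$ such that $\ker f = \mathcal{U}$ and $f(\mathcal{V}_1) = \mathcal{W}_1$.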


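Exercise 2.15. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces and let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. Show that the set $\{(\mathbf{v}, f(\mathbf{v})) \mid \mathbf{v} \in \mathcal{V}\}$ is a vector subspace of $\mathcal{V} \times \mathcal{W}$.


Exercise 2.16. If $\mathcal{V}$ and $\mathcal{W}$ are finite dimensional vector spaces, show that there is an injective linear transformation $f : \mathcal{V} \to \mathcal{W}$ if and only if $\dim \mathcal{V} \le \dim \mathcal{W}$.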


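Exercise 2.17. Show that the function $f : \mathcal{D}_{\mathbb{R}}(\mathbb{R}) \to \mathcal{F}_{\mathbb{R}}(\mathbb{R})$ defined by $f(y) = y + 2y'$ is linear and determine $\ker f$.


Exercise 2.18. Show that the function $f : \mathcal{D}^2_{\mathbb{R}}(\mathbb{R}) \to \mathcal{F}_{\mathbb{R}}(\mathbb{R})$ defined by $f(y) = y + y''$ is linear and determine $\ker f$.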


Exercise 2.19. Find a basis of ker f for the linear transformation f : M2 x2(K) 


— K defined by 
a b 
i{[0 ])-orssera 


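Exercise 2.20. Let $\mathcal{U}$ be the subspace of $\mathbb{R}^3$ defined by
\[ \mathcal{U} = \left\{ \begin{bmatrix} x \\ y \\ z \end{bmatrix} \in \mathbb{R}^3 : x + y + z = 0 \right\} \]
and let $f : \mathcal{U} \to \mathbb{R}^2$ be the linear transformation such that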


1 = 
f “1 -|3 md FA do al 


Determine f. 


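Exercise 2.21. Let $\mathcal{V}$ and $\mathcal{W}$ be finite dimensional vector spaces and let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. If $\dim \mathcal{W} = n$, show that for $j \in \{1, \dots, n\}$ there are vectors $\mathbf{w}_j \in \mathcal{W}$ and linear transformations $f_j : \mathcal{V} \to \mathbb{K}$ such that $f = \mathbf{w}_1 \otimes f_1 + \dots + \mathbf{w}_n \otimes f_n$, where $\mathbf{w}_j \otimes f_j$ is defined as in Exercise 2.8.


Exercise 2.22. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces and let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. Show that, if $\mathbf{u}_1, \dots, \mathbf{u}_n \in \mathcal{V}$ are linearly independent vectors such that $\mathcal{V} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_n\} \oplus \ker f$, then $\{f(\mathbf{u}_1), \dots, f(\mathbf{u}_n)\}$ is a basis of $\operatorname{ran} f$.


Exercise 2.23. Let $\mathcal{V}$ and $\mathcal{W}$ be finite dimensional vector spaces and let $f \in \mathcal{L}(\mathcal{V}, \mathcal{W})$. Use the Rank-Nullity Theorem to show that, if $f$ is surjective, then $\dim \mathcal{V} \ge \dim \mathcal{W}$.


Exercise 2.24. Let $\mathcal{V}$ and $\mathcal{W}$ be finite dimensional vector spaces. Explain the meaning of the Rank-Nullity Theorem for the function $f : \mathcal{V} \times \mathcal{W} \to \mathcal{V}$ defined by
\[ f(\mathbf{v}, \mathbf{w}) = \mathbf{v}. \]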




2.6.2 Isomorphisms 


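Exercise 2.25. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation and let $\{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ be a basis of $\operatorname{ran} f$. Then there are $\mathbf{u}_1, \dots, \mathbf{u}_n \in \mathcal{V}$ such that $f(\mathbf{u}_j) = \mathbf{w}_j$ for every $j \in \{1, \dots, n\}$. Consider the linear transformation $g : \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_n\} \to \operatorname{ran} f$ defined by $g(\mathbf{u}) = f(\mathbf{u})$ for every $\mathbf{u} \in \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_n\}$. Using Exercise 2.6, show that $g$ is an isomorphism such that for every $\mathbf{v} \in \mathcal{V}$ we have $g(h(\mathbf{v})) = f(\mathbf{v})$, where $h : \mathcal{V} \to \mathcal{V}$ is the projection on $\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_n\}$ along $\ker f$.


Exercise 2.26. Let $f : \mathcal{P}_4(\mathbb{R}) \to \mathcal{P}_4(\mathbb{R})$ be the linear transformation defined by $f(p) = p''$. Find $n$, $\mathbf{w}_1, \dots, \mathbf{w}_n$, $\mathbf{u}_1, \dots, \mathbf{u}_n$, and $h$ that satisfy the conditions in Exercise 2.25.


Exercise 2.27. Let $\mathcal{V}, \mathcal{W}_1, \mathcal{W}_2$ be arbitrary vector spaces. Show that the vector space $\mathcal{L}(\mathcal{V}, \mathcal{W}_1 \times \mathcal{W}_2)$ is isomorphic to the vector space $\mathcal{L}(\mathcal{V}, \mathcal{W}_1) \times \mathcal{L}(\mathcal{V}, \mathcal{W}_2)$.


Exercise 2.28. Let $\mathcal{V}$ be a vector space and let $\mathcal{U}$, $\mathcal{W}_1$, and $\mathcal{W}_2$ be subspaces of $\mathcal{V}$. If $\mathcal{V} = \mathcal{U} \oplus \mathcal{W}_1 = \mathcal{U} \oplus \mathcal{W}_2$, show that there is an isomorphism $g : \mathcal{W}_1 \to \mathcal{W}_2$.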


Exercise 2.29. Show that there is an isomorphism f : K* + K* such that 


f?(x) = -x. 


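Exercise 2.30. Let $\mathcal{V}$ be a finite dimensional vector space and let $f, g \in \mathcal{L}(\mathcal{V})$. Show that the operator $fg$ is invertible if and only if both $f$ and $g$ are invertible.


Exercise 2.31. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces and let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis of $\mathcal{V}$. Show that the function $g : \mathcal{L}(\mathcal{V}, \mathcal{W}) \to \mathcal{W}^n$ defined by $g(f) = (f(\mathbf{v}_1), \dots, f(\mathbf{v}_n))$ is an isomorphism.


Exercise 2.32. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces and let $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be a basis of $\mathcal{V}$. If $\mathcal{W}$ is finite dimensional, show that the set $\mathcal{S} = \{f \in \mathcal{L}(\mathcal{V}, \mathcal{W}) : f(\mathbf{v}_1) = f(\mathbf{v}_2) = \mathbf{0}\}$ is a vector subspace of $\mathcal{L}(\mathcal{V}, \mathcal{W})$ and find $\dim \mathcal{S}$.


Exercise 2.33. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces. Show that there is an isomorphism between $\mathcal{V} \times \mathcal{W}$ and $\mathcal{W} \times \mathcal{V}$.


Exercise 2.34. Let $\mathcal{V}$ and $\mathcal{W}$ be finite dimensional vector spaces and let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. Show that the spaces $\mathcal{V}$ and $\ker f \times \operatorname{ran} f$ are isomorphic.


Exercise 2.35. Let $\mathcal{U}$, $\mathcal{V}$, and $\mathcal{W}$ be vector spaces. Show that, if $\mathcal{U}$ and $\mathcal{V}$ are isomorphic and $\mathcal{V}$ and $\mathcal{W}$ are isomorphic, then $\mathcal{U}$ and $\mathcal{W}$ are isomorphic.


Exercise 2.36. Let $\mathcal{V}$ be a vector space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator such that $f^2 = 0$. Show that $\operatorname{Id} - f$ is an isomorphism.


Exercise 2.37. Let $\mathcal{V}$ and $\mathcal{W}$ be vector spaces and let $f : \mathcal{V} \to \mathcal{W}$ be an isomorphism. Show that $\psi : \mathcal{L}(\mathcal{V}) \to \mathcal{L}(\mathcal{W})$ defined by $\psi(g) = fgf^{-1}$ is an isomorphism.


Exercise 2.38. Show that the function $f : \mathcal{M}_{m\times n}(\mathbb{K}) \to \mathcal{M}_{n\times m}(\mathbb{K})$ defined by $f(A) = A^T$ is an isomorphism.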




2.6.3 Linear transformations and matrices 


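Exercise 2.39. Let $\mathcal{V}$ be a vector space such that $\dim \mathcal{V} = 4$ and let $f : \mathcal{V} \to \mathcal{V}$ be an operator such that $\dim \operatorname{ran} f = 2$. Show that there are bases $\mathcal{B}$ and $\mathcal{C}$ of $\mathcal{V}$ such that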


ooo o 


1 00 
0 00 
fasc = | 10 
0 0 0 


Exercise 2.40. We consider the linear transformation $f : \mathcal{M}_{2\times 2}(\mathbb{K}) \to \mathcal{M}_{2\times 2}(\mathbb{K})$ defined by $f(X) = \frac{1}{2}(X + X^T)$. Determine the matrix of $f$ relative to the basis


e={fod-bal-fel-[ a} 


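Exercise 2.41. Let $\mathcal{U} = \operatorname{Span}\{\cos t, t\cos t, \sin t, t\sin t\}$ and let $f : \mathcal{U} \to \mathcal{U}$ be the operator $f(y) = y'$. Show that $\mathcal{B} = \{\cos t, t\cos t, \sin t, t\sin t\}$ is a basis of $\mathcal{U}$ and determine the $\mathcal{B}$-matrix of $f$.


Exercise 2.42. Let $\mathcal{U} = \operatorname{Span}\{\cos t, t\cos t, \sin t, t\sin t\}$ and let $f : \mathcal{U} \to \mathcal{U}$ be the operator $f(y) = y''$. Determine the matrix of $f$ relative to the basis $\{\cos t, t\cos t, \sin t, t\sin t\}$.


Exercise 2.43. Let $\mathcal{V} = \operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2\} = \operatorname{Span}\{\mathbf{w}_1, \mathbf{w}_2\}$ where $\{\mathbf{v}_1, \mathbf{v}_2\}$ and $\{\mathbf{w}_1, \mathbf{w}_2\}$ are bases. Let $f : \mathcal{V} \to \mathcal{V}$ be defined by $f(\mathbf{v}_1) = 5\mathbf{v}_1 + 7\mathbf{v}_2$ and $f(\mathbf{v}_2) = 2\mathbf{v}_1 + 3\mathbf{v}_2$. If $\mathbf{v}_1 = 2\mathbf{w}_1 - \mathbf{w}_2$ and $\mathbf{v}_2 = 5\mathbf{w}_1 + 4\mathbf{w}_2$, determine the matrix of $f$ relative to the basis $\{\mathbf{w}_1, \mathbf{w}_2\}$.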


2.6.4 Duality 
Exercise 2.44. We define fi, fo, fg € (IR°)’ by 


x x x 
fil |y| | =2e+yt+z, fol jy] | =2+2yt+z, and fs | jy) | =2t+y+2z. 
vA az vA 


Show that fi, fo, and fs are linearly independent. 


Exercise 2.45. Let V be a vector space and let $\{v_1, \dots, v_n\}$ be a basis of V.
Show that $f = f(v_1) l_{v_1} + \dots + f(v_n) l_{v_n}$ for every $f \in V'$.


Exercise 2.46. Let V be an n-dimensional vector space and let $\{f_1, \dots, f_n\}$
be a basis of $V'$. Show that the function $g : V \to \mathbb{K}^n$ defined by
$g(v) = (f_1(v), \dots, f_n(v))$ is an isomorphism.


Exercise 2.47. Let V be a finite dimensional vector space, let U be a subspace
of V, and let $x \in V$ be such that $x \notin U$. Show that there is a linear form $f \in V'$
such that $f(x) \ne 0$ and $f(u) = 0$ for every $u \in U$.




Exercise 2.48. Let V be a finite dimensional vector space and let
$f, g_1, \dots, g_n \in V'$. If $f(x) = 0$ for every $x \in \ker g_1 \cap \dots \cap \ker g_n$,
show that f is a linear combination of $g_1, \dots, g_n$.


Exercise 2.49. Let $f : V \to W$ be a linear transformation. We define the
function $f^T : W' \to V'$ by $f^T(l)(v) = l(f(v))$ for $l \in W'$ and $v \in V$. Show that
the function $f^T$ is a linear transformation.


Exercise 2.50. Let V and W be vector spaces and let $B_V = \{v_1, \dots, v_n\}$ and
$B_W = \{w_1, \dots, w_m\}$ be bases of V and W, respectively. Let $f : V \to W$ be
a linear transformation and let A be the matrix of f relative to the bases $B_V$
and $B_W$. Show that, if $f^T : W' \to V'$ is the linear transformation defined in
Exercise 2.49, then the matrix of $f^T$ relative to the dual bases $\{l_{w_1}, \dots, l_{w_m}\}$
and $\{l_{v_1}, \dots, l_{v_n}\}$ is $A^T$.


Exercise 2.51. Let $f : V \to W$ and $g : W \to X$ be linear transformations.
Show that $(gf)^T = f^T g^T$.


Exercise 2.52. Let V and W be vector spaces and let $f \in \mathcal{L}(V, W)$. Show
that, if f is an isomorphism, then $f^T$ is an isomorphism.


Exercise 2.53. Let V and W be finite dimensional vector spaces and let
$f \in \mathcal{L}(V, W)$. Let $G : (V')' \to (W')'$ be defined by $G(F)(l) = F(lf)$ for
$F \in (V')'$ and $l \in W'$. If $S : V \to (V')'$ and $T : W \to (W')'$ are the canonical
isomorphisms, show that, if $S(v) = F$ for some $v \in V$, then $T(f(v)) = G(F)$.


Exercise 2.54. Let V and W be finite dimensional vector spaces and let
$f \in \mathcal{L}(V, W)$. Show that there is a unique linear transformation
$g \in \mathcal{L}(V, W)$ such that $l(g(v)) = f^T(l)(v)$ for every $l \in W'$ and $v \in V$.


Exercise 2.55. Let U be a subspace of a finite dimensional vector space V.
Show that the set $U^0 = \{l \in V' : l(u) = 0 \text{ for every } u \in U\}$ is a subspace of $V'$
and that $\dim U^0 + \dim U = \dim V$.


Exercise 2.56. Let V and W be finite dimensional vector spaces. Show that
the function $f : \mathcal{L}(V, W) \to \mathcal{L}(W', V')$ defined by $f(g) = g^T$ is an isomorphism.


Exercise 2.57. Let $V_1$, $V_2$, and W be vector spaces. Show that the vector
space $\mathcal{L}(V_1 \times V_2, W)$ is isomorphic to the vector space
$\mathcal{L}(V_1, W) \times \mathcal{L}(V_2, W)$.


Exercise 2.58. Let V and W be vector spaces and let $f : V \to W$ be a
linear transformation. Show that $(\operatorname{ran} f)^0 = \ker f^T$, where $(\operatorname{ran} f)^0$ is defined
in Exercise 2.55 and $f^T$ is as in Exercise 2.49.


Exercise 2.59. Let V and W be finite dimensional vector spaces and let
$f : V \to W$ be a linear transformation. Using Exercise 2.58 show that f is surjective
if and only if $f^T$ is injective.


Exercise 2.60. Let U be a subspace of a vector space V and let $U^0$ be as
defined in Exercise 2.55. If $U^{00} = \{x \in V : l(x) = 0 \text{ for every } l \in U^0\}$, show
that $U^{00} = U$.




Exercise 2.61. Let V and W be finite dimensional vector spaces, $f \in \mathcal{L}(V, W)$,
and $l \in V'$. If $l \in (\ker f)^0$, show that there is $m \in W'$ such that $l = mf$.


Exercise 2.62. Let V and W be finite dimensional vector spaces and let
$f \in \mathcal{L}(V, W)$. Show that $(\ker f)^0 = \operatorname{ran} f^T$, where $(\ker f)^0$ is defined in
Exercise 2.55 and $f^T$ is as in Exercise 2.49.


Exercise 2.63. Let V and W be finite dimensional vector spaces and let
$f : V \to W$ be a linear transformation. Using Exercises 2.55 and 2.58 show that
$\dim \operatorname{ran} f = \dim \operatorname{ran} f^T$.


Exercise 2.64. Use Exercise 2.63 to show that f is injective if and only if $f^T$
is surjective.


Exercise 2.65 (Rank Theorem). Use Exercises 2.50 and 2.63 to show that if
$A \in \mathcal{M}_{n \times m}(\mathbb{K})$, then $\dim C(A) = \dim C(A^T)$.


Exercise 2.66. Let V and W be vector spaces with bases $\{v_1, \dots, v_m\}$ and
$\{w_1, \dots, w_n\}$, respectively. Show that the set of all linear transformations
$w_k \otimes l_{v_j}$, where $1 \le j \le m$ and $1 \le k \le n$, is a basis of $\mathcal{L}(V, W)$.
(See Exercise 2.8 for the definition of $w_k \otimes l_{v_j}$.)


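Exercise 2.67. Let V and W be vector spaces with bases $\{v_1, \dots, v_m\}$ and
$\{w_1, \dots, w_n\}$, respectively. For every $(j, k) \in \{1, \dots, m\} \times \{1, \dots, n\}$ we define
the linear transformation $f_{jk} : V \to W$ by

\[
f_{jk}(v_i) =
\begin{cases}
w_k & \text{if } i = j, \\
0 & \text{if } i \ne j.
\end{cases}
\]

Show that the set $\{f_{jk} : (j, k) \in \{1, \dots, m\} \times \{1, \dots, n\}\}$ is a basis of $\mathcal{L}(V, W)$.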


Exercise 2.68. Let V be a finite dimensional vector space and let
$f, f_1, \dots, f_j \in V'$. If $f \notin \operatorname{Span}\{f_1, \dots, f_j\}$, then there is $w \in V$ such that
$f(w) \ne 0$ and $f_k(w) = 0$ for every $k \in \{1, \dots, j\}$.


2.6.5 Quotient spaces 


Exercise 2.69. Let V and W be vector spaces and let $f : V \to W$ be a
linear transformation. Show that the function $\tilde{f} : V/\ker f \to \operatorname{ran} f$ defined by
$\tilde{f}(v + \ker f) = f(v)$ is an isomorphism.


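Exercise 2.70. Let $\mathbb{R}^3 = \mathbb{R}u \oplus \mathbb{R}v \oplus \mathbb{R}w$. If
$f : \mathbb{R}^3 \to \mathbb{R}v \oplus \mathbb{R}w$ is the projection
on $\mathbb{R}v \oplus \mathbb{R}w$ along $\mathbb{R}u$, then the function
$g : \mathbb{R}^3/\mathbb{R}u \to \mathbb{R}v \oplus \mathbb{R}w$ defined by

\[
g\bigl((\alpha v + \beta w + \gamma u)^{\wedge}\bigr) = \alpha v + \beta w
\]

is an isomorphism.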


Exercise 2.71. Let V be a vector space and let $f : V \to \mathbb{K}$ be a nonzero linear
transformation. Show that there is an isomorphism between $V/\ker f$ and $\mathbb{K}$.




Exercise 2.72. Let V and W be vector spaces and let $f : V \to W$ be a linear
transformation. Show that the spaces $V$ and $\ker f \times V/\ker f$ are isomorphic.


Exercise 2.73. Let V and W be finite dimensional vector spaces. If $V_1 \subseteq V$
and $W_1 \subseteq W$ are subspaces, show that the spaces $(V \times W)/(V_1 \times W_1)$ and
$V/V_1 \times W/W_1$ are isomorphic.


Exercise 2.74. Let V and W be vector spaces and let U be a subspace of V.
If $f : V \to W$ is a linear transformation such that $U \subseteq \ker f$, show that the
function $g : V/U \to W$ defined by $g(q(x)) = f(x)$, where $q : V \to V/U$ is the
quotient linear transformation, is a well-defined linear transformation.


Exercise 2.75. Let U be a subspace of a vector space V. Show that $\dim V/U = n$
if and only if there are linearly independent vectors $v_1, \dots, v_n \in V$ such that
$V = U \oplus \mathbb{K}v_1 \oplus \dots \oplus \mathbb{K}v_n$.


Exercise 2.76. Let V be a vector space and let U be a subspace of V. If
$f : V \to V$ is a linear transformation and U is f-invariant, then there is a unique
linear transformation $g : V/U \to V/U$ such that $qf = gq$, where $q : V \to V/U$ is
the quotient linear transformation.


Exercise 2.77. If U and W are subspaces of a vector space V such that
$V = U \oplus W$, show that the linear transformation $h : W \to V/U$ defined by
$h(w) = w + U$ is an isomorphism, without using Theorem 2.5.9.


Exercise 2.78. Let v, u, and w be linearly independent vectors in $\mathbb{R}^3$. Show
that the set $\{u + \mathbb{R}v, w + \mathbb{R}v\}$ is a basis of $\mathbb{R}^3/\mathbb{R}v$.


Exercise 2.79. Let U and W be subspaces of a vector space V such that
$V = U \oplus W$. If $\{w_1, \dots, w_r\}$ is a basis of W, show that
$\{w_1 + U, \dots, w_r + U\}$ is a basis of $V/U$.


Chapter 3 


Inner Product Spaces 


Introduction 


The dot product is an important tool in the linear algebra of Euclidean spaces 
as well as many applications. In this chapter we investigate properties of vector 
spaces where an abstract form of the dot product is available. In the context of 
general vector spaces the name inner product is used instead of dot product. 


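In some examples and exercises in this chapter we will use determinants. In
particular, we will use the fact that a matrix
$\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \in \mathcal{M}_{2 \times 2}(\mathbb{K})$
is invertible if and only if

\[
\det \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \alpha\delta - \beta\gamma \ne 0.
\]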


The use of determinants in those examples and exercises is not essential, but 
it is convenient and it leads to simplifications. Unlike some other textbooks at 
the same level, we do not consider determinants a forbidden tool. 


3.1 Definitions and examples 


In Chapter 2 we used the name linear form to mean a linear function $f : V \to \mathbb{K}$.
In this chapter we consider functions $f : V \times V \to \mathbb{K}$. Since $V \times V$ is a vector
space, we can talk about linearity of $f : V \times V \to \mathbb{K}$:

\[
f(\alpha_1(x_1, y_1) + \alpha_2(x_2, y_2)) = \alpha_1 f(x_1, y_1) + \alpha_2 f(x_2, y_2).
\]

In the context of inner product spaces it is natural to consider a different property
of functions $f : V \times V \to \mathbb{K}$ related to linearity, namely bilinearity.




Definition 3.1.1. By a bilinear form on a vector space V we mean a
function $f : V \times V \to \mathbb{K}$ such that

(a) $f(x_1 + x_2, y) = f(x_1, y) + f(x_2, y)$,
(b) $f(x, y_1 + y_2) = f(x, y_1) + f(x, y_2)$,
(c) $f(\alpha x, y) = \alpha f(x, y)$,
(d) $f(x, \alpha y) = \alpha f(x, y)$,

for all vectors $x, x_1, x_2, y, y_1, y_2 \in V$ and all numbers $\alpha \in \mathbb{K}$.


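Note that the conditions (a)-(d) in the above definition can be expressed as
a single equality:

\[
\begin{aligned}
f(\alpha_1 x_1 + \alpha_2 x_2, \beta_1 y_1 + \beta_2 y_2)
= {} & \alpha_1\beta_1 f(x_1, y_1) + \alpha_2\beta_1 f(x_2, y_1) \\
& + \alpha_1\beta_2 f(x_1, y_2) + \alpha_2\beta_2 f(x_2, y_2).
\end{aligned}
\]

Clearly, this condition implies

\[
f\Bigl(\sum_{j=1}^{m} \alpha_j x_j, \sum_{k=1}^{n} \beta_k y_k\Bigr)
= \sum_{j=1}^{m} \sum_{k=1}^{n} \alpha_j \beta_k f(x_j, y_k).
\]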


The conditions for linearity and bilinearity of a function $f : V \times V \to \mathbb{K}$ are
not equivalent. For example, for a linear f we have $f(\alpha x, \alpha y) = \alpha f(x, y)$,
while for a bilinear f we have $f(\alpha x, \alpha y) = \alpha^2 f(x, y)$.


Example 3.1.2. If f and g are linear forms on a vector space V, then the
function

\[
h(x, y) = f(x) + g(y)
\]

is a linear form on $V \times V$ and the function

\[
k(x, y) = f(x) g(y)
\]

is a bilinear form on V.
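
The different scaling behaviour of a linear form on $V \times V$ and a bilinear form on V can be checked numerically. The following is a minimal sketch in Python; the particular linear forms f and g on $\mathbb{R}^2$, the test vectors, and the helper name `scale` are illustrative assumptions, not taken from the text.

```python
# Hypothetical linear forms on R^2, chosen only for illustration.
def f(x):
    return 2 * x[0] - x[1]

def g(y):
    return y[0] + 3 * y[1]

def h(x, y):
    # h(x, y) = f(x) + g(y) is a linear form on V x V.
    return f(x) + g(y)

def k(x, y):
    # k(x, y) = f(x) g(y) is a bilinear form on V.
    return f(x) * g(y)

def scale(a, x):
    # Scalar multiple a * x of a vector stored as a list.
    return [a * t for t in x]

x, y, a = [3, 1], [1, 1], 2
h_scaled = h(scale(a, x), scale(a, y))  # scales like a * h(x, y)
k_scaled = k(scale(a, x), scale(a, y))  # scales like a**2 * k(x, y)
print(h_scaled == a * h(x, y), k_scaled == a**2 * k(x, y))
```

Scaling both arguments by the same number multiplies h by that number and k by its square, which is why the two notions cannot coincide for nonzero forms.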


If $\mathbb{K} = \mathbb{C}$, the field of complex numbers, then there are reasons to replace
condition (d) in the definition of bilinearity with the condition
$f(x, \alpha y) = \overline{\alpha} f(x, y)$, where $\overline{\alpha}$ denotes the complex conjugate of $\alpha$.




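Definition 3.1.3. By a sesquilinear form on a vector space V we mean
a function $s : V \times V \to \mathbb{K}$ such that

(a) $s(x_1 + x_2, y) = s(x_1, y) + s(x_2, y)$,
(b) $s(x, y_1 + y_2) = s(x, y_1) + s(x, y_2)$,
(c) $s(\alpha x, y) = \alpha s(x, y)$,
(d) $s(x, \alpha y) = \overline{\alpha} s(x, y)$,

for all vectors $x, x_1, x_2, y, y_1, y_2 \in V$ and all numbers $\alpha \in \mathbb{K}$.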


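As in the case of bilinear forms, the conditions (a)-(d) in the above definition
can be expressed as a single equality:

\[
\begin{aligned}
s(\alpha_1 x_1 + \alpha_2 x_2, \beta_1 y_1 + \beta_2 y_2)
= {} & \alpha_1\overline{\beta_1} s(x_1, y_1) + \alpha_2\overline{\beta_1} s(x_2, y_1) \\
& + \alpha_1\overline{\beta_2} s(x_1, y_2) + \alpha_2\overline{\beta_2} s(x_2, y_2).
\end{aligned}
\]

In general, we have

\[
s\Bigl(\sum_{j=1}^{m} \alpha_j x_j, \sum_{k=1}^{n} \beta_k y_k\Bigr)
= \sum_{j=1}^{m} \sum_{k=1}^{n} \alpha_j \overline{\beta_k} s(x_j, y_k).
\]

Note that for a function $f : V \times V \to \mathbb{R}$ the conditions for bilinearity and
sesquilinearity are equivalent.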


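Example 3.1.4. The function

\[
s\left(\begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}, \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}\right)
= 3u_1\overline{v_1} + \sqrt{2}\,u_2\overline{v_2} + \frac{1}{5}u_3\overline{v_3}
\]

is a sesquilinear form on the vector space $\mathbb{C}^3$ over $\mathbb{C}$.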


Definition 3.1.5. A form $f : V \times V \to \mathbb{K}$ is called symmetric if
$f(x, y) = f(y, x)$ for all $x, y \in V$.

The functions h and k in Example 3.1.2 are symmetric if and only if $f(x) = g(x)$
for all $x \in V$. Note that, if $f : V \times V \to \mathbb{K}$ is symmetric, then (b) in
Definition 3.1.1 follows from (a). Similarly, (d) follows from (c).




The following theorem is an easy consequence of the definition of symmetric 
bilinear forms. 


Theorem 3.1.6. Let V be a vector space over $\mathbb{R}$ and let $s : V \times V \to \mathbb{R}$
be a symmetric bilinear form. Then

\[
s(x, y) = \frac{1}{4}\bigl[s(x + y, x + y) - s(x - y, x - y)\bigr]
\]

for all $x, y \in V$.


The identity in the above theorem is often referred to as a polarization iden- 
tity. It implies that, if the values of s(x,x) are known for all x € V, then the 
values of s(x, y) are known for all x,y € V, which is often used in arguments. 
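
The real polarization identity is easy to test on a concrete form. The sketch below assumes forms on $\mathbb{R}^2$ given by a matrix A through $s(x, y) = \sum_{i,j} A_{ij} x_i y_j$; the matrices, the vectors, and the names `form` and `polarize` are hypothetical choices made for illustration. A non-symmetric form is included for contrast, since the identity relies on symmetry.

```python
def form(A):
    # Bilinear form s(x, y) = sum_{i,j} A[i][j] * x[i] * y[j].
    def s(x, y):
        return sum(A[i][j] * x[i] * y[j]
                   for i in range(len(x)) for j in range(len(y)))
    return s

def polarize(s, x, y):
    # Right-hand side of the polarization identity: uses only values s(z, z).
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return (s(plus, plus) - s(minus, minus)) / 4

sym = form([[2, 1], [1, 3]])    # symmetric matrix, so sym is a symmetric form
alt = form([[0, 1], [-1, 0]])   # alt(x, y) = x1*y2 - x2*y1, not symmetric

x, y = [1, 2], [3, -1]
print(sym(x, y), polarize(sym, x, y))  # the two values agree
print(alt(x, y), polarize(alt, x, y))  # the two values differ
```

For the symmetric form both numbers coincide, while for the non-symmetric form the right-hand side is 0 although the form itself is not zero at (x, y).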


Corollary 3.1.7. If $s_1$ and $s_2$ are symmetric bilinear forms on a real
vector space V such that

\[
s_1(x, x) = s_2(x, x) \quad\text{for all } x \in V,
\]

then

\[
s_1(x, y) = s_2(x, y) \quad\text{for all } x, y \in V.
\]

In particular, if s is a symmetric bilinear form such that $s(x, x) = 0$ for
every $x \in V$, then $s = 0$.


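Note that the above property is not true for all bilinear forms. Indeed, for
the bilinear form $s : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined by

\[
s(x, y) = s\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}\right) = x_1 y_2 - x_2 y_1
\]

we have $s(x, x) = 0$ for every $x \in \mathbb{R}^2$, but it is not true that $s(x, y) = 0$ for
every $x, y \in \mathbb{R}^2$.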


It turns out that for sesquilinear forms a different condition is more natural 
than symmetry. 


Definition 3.1.8. A sesquilinear form $s : V \times V \to \mathbb{K}$ is called a hermitian
form if

\[
s(x, y) = \overline{s(y, x)}
\]

for all $x, y \in V$.




Example 3.1.9. The function 


U1 U1 
s u2|, | ve = 3u,07 + V2ueta + =ugts 
U3 U3 


is not. 


As in the case of symmetric forms, if s is a hermitian form, then the condition 
(b) in Definition 3.1.3 follows from (a) and (d) follows from (c). 


Theorem 3.1.10. Let V be a vector space over $\mathbb{C}$ and let $s : V \times V \to \mathbb{C}$ be
a sesquilinear form on V. Then s is hermitian if and only if $s(x, x) \in \mathbb{R}$
for every $x \in V$.

Proof. If s is hermitian, then for every $x \in V$ we have $s(x, x) = \overline{s(x, x)}$, which
means that $s(x, x) \in \mathbb{R}$.
Suppose now that $s(v, v) \in \mathbb{R}$ for every $v \in V$. Then

\[
\alpha = s(x, y) + s(y, x) = s(x + y, x + y) - s(x, x) - s(y, y) \in \mathbb{R}
\]

and

\[
\begin{aligned}
\beta = i(-s(x, y) + s(y, x)) &= s(x, iy) + s(iy, x) \\
&= s(x + iy, x + iy) - s(x, x) - s(iy, iy) \in \mathbb{R}.
\end{aligned}
\]

Since

\[
s(x, y) = \frac{1}{2}(\alpha + i\beta) \quad\text{and}\quad s(y, x) = \frac{1}{2}(\alpha - i\beta),
\]

s is hermitian. □




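Theorem 3.1.11. Let V be a vector space over $\mathbb{C}$ and let $s : V \times V \to \mathbb{C}$
be a hermitian form. Then

\[
s(x, y) = \frac{1}{4}\bigl[s(x + y, x + y) - s(x - y, x - y)
+ i s(x + iy, x + iy) - i s(x - iy, x - iy)\bigr]
\]

for all $x, y \in V$.

Proof. The result is a consequence of the following equalities:

\[
\begin{aligned}
s(x + y, x + y) - s(x - y, x - y) &= 2s(x, y) + 2s(y, x) \\
&= 2s(x, y) + 2\overline{s(x, y)} \\
&= 4\operatorname{Re} s(x, y)
\end{aligned}
\]

and

\[
\begin{aligned}
s(x + iy, x + iy) - s(x - iy, x - iy) &= 4\operatorname{Re} s(x, iy) \\
&= 4\operatorname{Re}(-i s(x, y)) \\
&= 4\operatorname{Im} s(x, y).
\end{aligned}
\]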


The identity in the above theorem is a complex version of the polarization 
identity. As in the real case it implies the following useful property of hermitian 
sesquilinear forms. 


Corollary 3.1.12. If $s_1$ and $s_2$ are hermitian forms on a complex vector
space V such that

\[
s_1(x, x) = s_2(x, x) \quad\text{for all } x \in V,
\]

then

\[
s_1(x, y) = s_2(x, y) \quad\text{for all } x, y \in V.
\]

In particular, if s is a sesquilinear form such that $s(x, x) = 0$ for every
$x \in V$, then $s = 0$.




Definition 3.1.13. A sesquilinear form $s : V \times V \to \mathbb{K}$ is called a positive
form if

\[
s(x, x) \ge 0
\]

for every $x \in V$.
A positive form is called positive definite if

\[
s(x, x) > 0
\]

whenever $x \ne 0$.

The condition $s(x, x) \ge 0$ implicitly assumes that $s(x, x) \in \mathbb{R}$ for every
$x \in V$. Consequently, by Theorem 3.1.10, every positive form is hermitian.


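Example 3.1.14. The function
$s : \mathcal{M}_{2 \times 2}(\mathbb{C}) \times \mathcal{M}_{2 \times 2}(\mathbb{C}) \to \mathbb{C}$ defined by

\[
s\left(\begin{bmatrix} u_1 & u_2 \\ u_3 & u_4 \end{bmatrix}, \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \end{bmatrix}\right)
= u_1\overline{v_1} + u_2\overline{v_2} + u_3\overline{v_3} + u_4\overline{v_4}
\]

is a positive definite sesquilinear form on $\mathcal{M}_{2 \times 2}(\mathbb{C})$.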


Now we are in a position to define the generalization of the dot product to 
arbitrary vector spaces. 


Definition 3.1.15. By an inner product on a vector space V we mean a
positive definite sesquilinear form on V. A vector space V with an inner
product is called an inner product space.

The inner product of two vectors x and y in V is denoted by $\langle x, y \rangle$. Below
we list all properties that constitute the definition of an inner product.




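A function

\[
\langle \cdot, \cdot \rangle : V \times V \to \mathbb{K}
\]

is an inner product on V if the following conditions are satisfied.

1. $\langle \cdot, \cdot \rangle$ is sesquilinear:
   $\langle x_1 + x_2, y \rangle = \langle x_1, y \rangle + \langle x_2, y \rangle$ for all $x_1, x_2, y \in V$,
   $\langle x, y_1 + y_2 \rangle = \langle x, y_1 \rangle + \langle x, y_2 \rangle$ for all $x, y_1, y_2 \in V$,
   $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$ for all $x, y \in V$ and $\alpha \in \mathbb{K}$,
   $\langle x, \alpha y \rangle = \overline{\alpha} \langle x, y \rangle$ for all $x, y \in V$ and $\alpha \in \mathbb{K}$,

2. $\langle \cdot, \cdot \rangle$ is hermitian: $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all $x, y \in V$,

3. $\langle \cdot, \cdot \rangle$ is positive definite: $\langle x, x \rangle > 0$ for all $0 \ne x \in V$.

In view of the previous comments and Theorem 3.1.10, in order to verify
that a function $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$ is an inner product on V it suffices to check
the following three conditions:

(i) $\langle \alpha_1 x_1 + \alpha_2 x_2, y \rangle = \alpha_1 \langle x_1, y \rangle + \alpha_2 \langle x_2, y \rangle$ for all $x_1, x_2, y \in V$ and $\alpha_1, \alpha_2 \in \mathbb{K}$,

(ii) $\langle x, y \rangle = \overline{\langle y, x \rangle}$ for all $x, y \in V$,

(iii) $\langle x, x \rangle > 0$ for all nonzero $x \in V$.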


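Example 3.1.16. The standard inner product in the vector space $\mathbb{C}^n$ is defined by

\[
\left\langle \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \right\rangle
= \sum_{j=1}^{n} x_j \overline{y_j} = x_1\overline{y_1} + \dots + x_n\overline{y_n}.
\]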

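Example 3.1.17. The functions defined in Examples 3.1.4 and 3.1.14 are
examples of inner products.

The vector space $\mathbb{C}^3$ is an inner product space with the inner product
defined by

\[
\left\langle \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix}, \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} \right\rangle
= 3u_1\overline{v_1} + \sqrt{2}\,u_2\overline{v_2} + \frac{1}{5}u_3\overline{v_3}.
\]

More generally, for any positive real numbers $\alpha_1, \dots, \alpha_n$, the form

\[
\left\langle \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \right\rangle
= \sum_{j=1}^{n} \alpha_j x_j \overline{y_j} = \alpha_1 x_1\overline{y_1} + \dots + \alpha_n x_n\overline{y_n}
\]

is an inner product in $\mathbb{C}^n$.
The vector space $\mathcal{M}_{2 \times 2}(\mathbb{C})$ is an inner product space with the inner product
defined by

\[
\left\langle \begin{bmatrix} u_1 & u_2 \\ u_3 & u_4 \end{bmatrix}, \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \end{bmatrix} \right\rangle
= u_1\overline{v_1} + u_2\overline{v_2} + u_3\overline{v_3} + u_4\overline{v_4}.
\]

This example can be easily generalized to $\mathcal{M}_{m \times n}(\mathbb{C})$ for any positive integers
m and n.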


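Example 3.1.18. The vector space $\mathcal{C}_{[a,b]}(\mathbb{C})$ of all continuous complex-valued
functions on the interval $[a, b]$ is an inner product space with the inner product
defined by

\[
\langle f, g \rangle = \int_a^b f(t)\overline{g(t)}\,dt.
\]

More generally, for any $\varphi \in \mathcal{C}_{[a,b]}(\mathbb{C})$ such that $\varphi(t) > 0$ for all $t \in [a, b]$, the
form

\[
\langle f, g \rangle = \int_a^b f(t)\overline{g(t)}\varphi(t)\,dt
\]

is an inner product in $\mathcal{C}_{[a,b]}(\mathbb{C})$.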


Example 3.1.19. Let $V = \mathcal{C}^2_{[0,1]}(\mathbb{C})$, the space of complex-valued functions
on $[0, 1]$ with continuous second derivatives. Show that

\[
\langle f, g \rangle = f(0)\overline{g(0)} + f'(0)\overline{g'(0)} + \int_0^1 f''(t)\overline{g''(t)}\,dt
\]

is an inner product.

Solution. The only nontrivial part is showing that $\langle f, f \rangle = 0$ implies $f = 0$.
Since

\[
\begin{aligned}
\langle f, f \rangle &= f(0)\overline{f(0)} + f'(0)\overline{f'(0)} + \int_0^1 f''(t)\overline{f''(t)}\,dt \\
&= |f(0)|^2 + |f'(0)|^2 + \int_0^1 |f''(t)|^2\,dt,
\end{aligned}
\]

if $\langle f, f \rangle = 0$, then $f(0) = 0$, $f'(0) = 0$, and $f''(t) = 0$ for all $t \in [0, 1]$. From
$f''(t) = 0$ we get $f(t) = at + b$. Then, from $f(0) = 0$ we get $b = 0$ and finally
from $f'(0) = 0$ we get $a = 0$. Consequently, $f(t) = 0$ for all $t \in [0, 1]$. □


Example 3.1.20. Let $V = \mathcal{C}^1_{[0,1]}(\mathbb{C})$, the space of complex-valued functions on
$[0, 1]$ with continuous derivatives. Show that

\[
\langle f, g \rangle = \int_0^1 f'(t)\overline{g'(t)}\,dt
\]

is not an inner product.

Solution. The defined function is not an inner product because it is not positive
definite. Indeed, $\langle f, f \rangle = 0$ implies $f' = 0$, but this does not mean that $f = 0$
because f could be any constant, not necessarily 0.

Note that the defined function is a positive sesquilinear form. □


The inequality in the next theorem, known as Schwarz’s Inequality, is one 
of the most important and useful properties of the inner product. 


Theorem 3.1.21 (Schwarz’s Inequality). Let V be an inner product
space. Then

\[
|\langle x, y \rangle|^2 \le \langle x, x \rangle \langle y, y \rangle
\]

for all $x, y \in V$.


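Proof. If $y = 0$, then the inequality is trivially true since both sides are equal
to zero. If $y \ne 0$, then

\[
0 \le \langle x + \alpha y, x + \alpha y \rangle = \langle x, x \rangle + \overline{\alpha}\langle x, y \rangle + \alpha\langle y, x \rangle + |\alpha|^2 \langle y, y \rangle
\]

for any $\alpha \in \mathbb{C}$. If we let $\alpha = -\dfrac{\langle x, y \rangle}{\langle y, y \rangle}$, then the above inequality becomes

\[
0 \le \langle x, x \rangle - \frac{\langle y, x \rangle}{\langle y, y \rangle}\langle x, y \rangle
- \frac{\langle x, y \rangle}{\langle y, y \rangle}\langle y, x \rangle
+ \frac{|\langle x, y \rangle|^2}{\langle y, y \rangle^2}\langle y, y \rangle.
\]

After multiplying the above inequality by $\langle y, y \rangle$ and simplifying we get

\[
0 \le \langle x, x \rangle \langle y, y \rangle - |\langle x, y \rangle|^2,
\]

which is Schwarz’s inequality. □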


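Theorem 3.1.22. Let V be an inner product space and let $x, y \in V$.
Then

\[
|\langle x, y \rangle|^2 = \langle x, x \rangle \langle y, y \rangle
\]

if and only if the vectors x and y are linearly dependent.

Proof. Let x and y be linearly dependent vectors in V. Without loss of generality
we can assume that $y = \alpha x$ for some $\alpha \in \mathbb{K}$. Then

\[
\begin{aligned}
|\langle x, y \rangle|^2 &= |\langle x, \alpha x \rangle|^2 = |\alpha|^2 \langle x, x \rangle^2 \\
&= \langle x, x \rangle \alpha\overline{\alpha} \langle x, x \rangle = \langle x, x \rangle \langle \alpha x, \alpha x \rangle = \langle x, x \rangle \langle y, y \rangle.
\end{aligned}
\]

Now, we assume that x and y are vectors in V such that
$|\langle x, y \rangle|^2 = \langle x, x \rangle \langle y, y \rangle$.
Then $\langle x, y \rangle \langle y, x \rangle = \langle x, x \rangle \langle y, y \rangle$ and consequently

\[
\begin{aligned}
& \bigl\langle \langle y, y \rangle x - \langle x, y \rangle y, \langle y, y \rangle x - \langle x, y \rangle y \bigr\rangle \\
&\quad = \langle y, y \rangle^2 \langle x, x \rangle - \langle y, y \rangle \langle y, x \rangle \langle x, y \rangle
- \langle x, y \rangle \langle y, y \rangle \langle y, x \rangle + \langle x, y \rangle \langle y, x \rangle \langle y, y \rangle = 0.
\end{aligned}
\]

This shows that $\langle y, y \rangle x - \langle x, y \rangle y = 0$, which implies linear dependence of x and y. □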
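
Theorems 3.1.21 and 3.1.22 can be illustrated numerically. The sketch below assumes the standard inner product on $\mathbb{C}^3$ (linear in the first argument, conjugate-linear in the second); the vectors and the helper name `inner` are arbitrary choices made for illustration.

```python
def inner(x, y):
    # Standard inner product on C^n: sum of x_j * conjugate(y_j).
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

x = [1 + 2j, 2 - 3j, -2]
y = [1j, -1, 1]

# Schwarz's inequality for a pair of linearly independent vectors.
lhs = abs(inner(x, y)) ** 2
rhs = (inner(x, x) * inner(y, y)).real
print(lhs <= rhs)

# Equality for linearly dependent vectors: z is a scalar multiple of y.
z = [(2 - 1j) * t for t in y]
gap = (inner(z, z) * inner(y, y)).real - abs(inner(z, y)) ** 2
print(abs(gap) < 1e-9)
```

The comparison for the dependent pair uses a small tolerance because the computation is carried out in floating-point arithmetic.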


The dot product in $\mathbb{R}^n$ has an important geometric meaning. For example,
if x and y are nonzero vectors in $\mathbb{R}^3$ and $x \cdot y = 0$, then the vectors are
perpendicular, that is, the angle between them is 90°. In general vector spaces
we do not have that geometric interpretation. For example, what would “the 
angle between functions sint and cost” even mean? On the other hand, the 
importance of the dot product in R” goes far beyond its connection with the 
angle between vectors. Many of those properties and applications of the dot 
product extend to the inner product in general vector spaces. 


Definition 3.1.23. Let V be an inner product space. Vectors x,y € V 
are called orthogonal if 


(x,y) = 0. 




Example 3.1.24. Show that the vectors 
1+ 2% d —2-i 
Pe Nicaea (ai eee 
are orthogonal in the inner product space C?. 


Solution. 


((32]-[a)) cra +2-o=a~0 


Example 3.1.25. Consider the vector space of continuous functions defined
on the interval $[-1, 1]$ with the inner product

\[
\langle f, g \rangle = \int_{-1}^{1} f(t)\overline{g(t)}\,dt.
\]

Show that an odd function $f(t)$, that is, a function such that $f(-t) = -f(t)$
for all $t \in [-1, 1]$, and $\cos t$ are orthogonal.

Solution. First we note that the function $f(t)\cos t$ is odd since

\[
f(-t)\cos(-t) = -f(t)\cos t
\]

for all $t \in [-1, 1]$. Hence,

\[
\begin{aligned}
\langle f(t), \cos t \rangle &= \int_{-1}^{0} f(t)\cos t\,dt + \int_{0}^{1} f(t)\cos t\,dt \\
&= -\int_{0}^{1} f(t)\cos t\,dt + \int_{0}^{1} f(t)\cos t\,dt = 0.
\end{aligned}
\]

□


Another tool that plays an important role in the linear algebra of $\mathbb{R}^n$ is the
norm. The standard norm, called the Euclidean norm, is defined as

\[
\|(x_1, \dots, x_n)\| = \sqrt{x_1^2 + \dots + x_n^2}.
\]

A norm in a general vector space V is introduced as a function $\|\cdot\| : V \to [0, \infty)$
satisfying certain conditions.




Definition 3.1.26. Let V be a vector space. By a norm in V we mean
a function $\|\cdot\| : V \to [0, \infty)$ such that

(a) $\|x + y\| \le \|x\| + \|y\|$,
(b) $\|\alpha x\| = |\alpha| \|x\|$,
(c) $\|x\| = 0$ if and only if $x = 0$,

for all vectors $x, y \in V$ and all numbers $\alpha \in \mathbb{K}$. A vector space with a
norm is called a normed space.

The inequality $\|x + y\| \le \|x\| + \|y\|$ is called the triangle inequality.


Example 3.1.27. Here are two examples of norms in $\mathbb{K}^n$:

\[
\|(x_1, \dots, x_n)\| = \sum_{k=1}^{n} |x_k|,
\]

\[
\|(x_1, \dots, x_n)\| = \max\{|x_1|, \dots, |x_n|\}.
\]

Example 3.1.28. The function

\[
\|f\| = \int_a^b |f(t)|\,dt
\]

is a norm in the vector space $\mathcal{C}_{[a,b]}(\mathbb{C})$ of all continuous functions $f : [a, b] \to \mathbb{C}$.


It turns out the inner product in a vector space defines in a natural way a 
norm in that space. That norm has the best algebraic and geometric properties. 


Theorem 3.1.29. Let V be an inner product space. The function

\[
\|x\| = \sqrt{\langle x, x \rangle}
\]

is a norm in V.

Proof. First notice that $\|x\|$ is well-defined because $\langle x, x \rangle$ is always a nonnegative
real number. Since the inner product is a positive definite form,
$\|x\| = 0$ if and only if $x = 0$. Moreover

\[
\|\alpha x\| = \sqrt{\langle \alpha x, \alpha x \rangle} = \sqrt{\alpha\overline{\alpha}\langle x, x \rangle} = |\alpha| \|x\|.
\]

The triangle inequality follows from Schwarz’s inequality:

\[
\begin{aligned}
\|x + y\|^2 &= \langle x + y, x + y \rangle \\
&= \langle x, x \rangle + 2\operatorname{Re}\langle x, y \rangle + \langle y, y \rangle \\
&\le \langle x, x \rangle + 2|\langle x, y \rangle| + \langle y, y \rangle \\
&\le \|x\|^2 + 2\|x\|\|y\| + \|y\|^2 \quad\text{(by Schwarz’s inequality)} \\
&= (\|x\| + \|y\|)^2.
\end{aligned}
\]

Hence $\|x + y\| \le \|x\| + \|y\|$. □

When we say the norm in an inner product space we always mean the norm

\[
\|x\| = \sqrt{\langle x, x \rangle}.
\]


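Example 3.1.30. The standard norm in the vector space $\mathbb{C}^n$ is defined by
the inner product

\[
\langle x, y \rangle = \sum_{j=1}^{n} x_j \overline{y_j},
\]

which means that

\[
\left\| \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \right\| = \sqrt{|x_1|^2 + \dots + |x_n|^2}.
\]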


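Example 3.1.31. Since

\[
\left\langle \begin{bmatrix} 1 + 2i \\ 2 - 3i \\ -2 \end{bmatrix}, \begin{bmatrix} 1 + 2i \\ 2 - 3i \\ -2 \end{bmatrix} \right\rangle = 5 + 13 + 4 = 22,
\]

the norm of the vector $\begin{bmatrix} 1 + 2i \\ 2 - 3i \\ -2 \end{bmatrix} \in \mathbb{C}^3$ is $\sqrt{22}$.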
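
The computation in Example 3.1.31 can be reproduced in a few lines. This is a minimal sketch that assumes the standard inner product on $\mathbb{C}^3$; the function names `inner` and `norm` are illustrative only.

```python
import math

def inner(x, y):
    # Standard inner product on C^n: sum of x_j * conjugate(y_j).
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm(x):
    # Norm induced by the inner product; <x, x> is real and nonnegative.
    return math.sqrt(inner(x, x).real)

v = [1 + 2j, 2 - 3j, -2]
print(inner(v, v).real, norm(v))
```

The three terms of the inner product are $|1 + 2i|^2 = 5$, $|2 - 3i|^2 = 13$, and $|-2|^2 = 4$, matching the sum in the example.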


Example 3.1.32. Consider the vector space of continuous functions defined
on the interval $[-\pi, \pi]$ with the inner product

\[
\langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\overline{g(t)}\,dt.
\]

Find $\|\sin nt\|$ for any positive integer n.

Solution. Since

\[
\|\sin nt\|^2 = \langle \sin nt, \sin nt \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} (\sin nt)^2\,dt
= \frac{1}{2\pi} \int_{-\pi}^{\pi} (1 - \cos(2nt))\,dt = 1,
\]

we have $\|\sin nt\| = 1$. □


Schwarz’s Inequality is often stated and applied in the form given in the
following corollary.

Corollary 3.1.33. Let V be an inner product space. Then

\[
|\langle x, y \rangle| \le \|x\| \|y\|
\]

for all $x, y \in V$.


From Theorem 3.1.11 we obtain the polarization identity for the inner prod- 
uct expressed in terms of the norm. 


Corollary 3.1.34 (Polarization identity). Let V be an inner product
space over $\mathbb{C}$. Then

\[
\langle x, y \rangle = \frac{1}{4}\bigl(\|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2\bigr)
\]

for all $x, y \in V$.
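
The complex polarization identity recovers the inner product from norms alone. The sketch below checks this for the standard inner product on $\mathbb{C}^3$; the vectors and the helper names `inner`, `norm_sq`, `combo`, and `polar` are assumptions made for illustration.

```python
def inner(x, y):
    # Standard inner product on C^n: sum of x_j * conjugate(y_j).
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def norm_sq(x):
    # Squared norm ||x||^2 = <x, x>.
    return inner(x, x).real

def combo(x, y, c):
    # The vector x + c*y.
    return [a + c * b for a, b in zip(x, y)]

def polar(x, y):
    # Right-hand side of the polarization identity; uses only norms.
    return (norm_sq(combo(x, y, 1)) - norm_sq(combo(x, y, -1))
            + 1j * norm_sq(combo(x, y, 1j))
            - 1j * norm_sq(combo(x, y, -1j))) / 4

x = [1 + 2j, 2 - 3j, -2]
y = [1j, -1, 1]
print(inner(x, y), polar(x, y))
```

Both printed values agree up to rounding, in line with the identity.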


We close this section with a general version of the Pythagorean Theorem 
that we are all familiar with from geometry. 


Theorem 3.1.35 (The Pythagorean Theorem). If u and v are orthogonal
vectors in an inner product space, then

\[
\|u + v\|^2 = \|u\|^2 + \|v\|^2.
\]

Proof. If u and v are orthogonal, then $\langle u, v \rangle = \langle v, u \rangle = 0$ and thus

\[
\|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + \langle u, v \rangle + \langle v, u \rangle + \langle v, v \rangle = \|u\|^2 + \|v\|^2.
\]

□


3.2 Orthogonal projections 


In an elementary calculus course you have probably seen an exercise similar to 
the following one: 


Find a point on the line 2x + 5y =7 that is closest to the point (3,6). 


In calculus we use derivatives to find such a point. Linear algebra offers 
a simpler and more elegant way of solving the above problem. Moreover, the 
algebraic method generalizes in a natural way to more difficult problems. For 
example, it would be rather difficult to use methods from calculus to solve the
following problem: 


Find numbers a, b, and c that minimize the integral i let —a-—bt— ct?|” dt. 


We will see in this section that both problems are quite similar from the 
point of view of linear algebra and that the method uses the inner product in a 
substantial way. 
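
As a preview of the algebraic method, the first problem can be solved with a single projection. The following sketch assumes the usual dot product on $\mathbb{R}^2$; since the line $2x + 5y = 7$ does not pass through the origin, it is first translated by a point on it. The point $(1, 1)$, the direction vector $(5, -2)$, and the function names are choices made here for illustration.

```python
from fractions import Fraction

def dot(a, b):
    # Dot product in R^n.
    return sum(s * t for s, t in zip(a, b))

def closest_point_on_line(p0, d, q):
    # Project q - p0 on Span{d}, then translate back by p0.
    w = [qi - pi for qi, pi in zip(q, p0)]
    c = Fraction(dot(w, d), dot(d, d))
    return [pi + c * di for pi, di in zip(p0, d)]

p0 = [1, 1]    # a point on the line 2x + 5y = 7
d = [5, -2]    # a direction vector of the line
closest = closest_point_on_line(p0, d, [3, 6])
print(closest)
```

Here the vector from $(1, 1)$ to $(3, 6)$ is $(2, 5)$, which is orthogonal to the direction $(5, -2)$, so the closest point is $(1, 1)$ itself.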


3.2.1 Orthogonal projections on lines 


We start by considering the simplest case of an orthogonal projection in 
a general vector space, namely, projection on a one-dimensional subspace. In 
R? we observe that, if u is a nonzero vector and v is a point not on the line 
through u and the origin, then the point p on that line that is the closest to v 
is characterized by the fact that (v — p,u) = 0. This property generalizes to 
any inner product space. 




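Theorem 3.2.1. Let V be an inner product space and let $u, v \in V$. If
$u \ne 0$, then there is a unique vector $p \in \operatorname{Span}\{u\}$ such that

\[
\langle v - p, u \rangle = 0.
\]

Proof. If $p \in \operatorname{Span}\{u\}$, then there is a number $\alpha \in \mathbb{K}$ such that $p = \alpha u$. If
$\langle v - p, u \rangle = 0$, then

\[
\langle v - p, u \rangle = \langle v - \alpha u, u \rangle = 0.
\]

The equation $\langle v - \alpha u, u \rangle = 0$ has a unique solution $\alpha = \dfrac{\langle v, u \rangle}{\langle u, u \rangle}$. Consequently,
$p = \dfrac{\langle v, u \rangle}{\langle u, u \rangle} u$ is the unique vector in $\operatorname{Span}\{u\}$ such that $\langle v - p, u \rangle = 0$. □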


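Definition 3.2.2. Let V be an inner product space and let $u \in V$ be a
nonzero vector. For $v \in V$ we define

\[
\operatorname{proj}_{\operatorname{Span}\{u\}}(v) = \frac{\langle v, u \rangle}{\langle u, u \rangle} u = \frac{\langle v, u \rangle}{\|u\|^2} u.
\]

The vector $\operatorname{proj}_{\operatorname{Span}\{u\}}(v)$ is called the orthogonal projection of v on the
subspace $\operatorname{Span}\{u\}$.

If u is a unit vector, that is, $\|u\| = 1$, then the expression for the projection
on $\operatorname{Span}\{u\}$ can be simplified:

\[
\operatorname{proj}_{\operatorname{Span}\{u\}}(v) = \langle v, u \rangle u.
\]

Assuming that u is a unit vector is not restrictive, since for any $u \ne 0$ we have

\[
\operatorname{Span}\{u\} = \operatorname{Span}\left\{ \frac{u}{\|u\|} \right\}.
\]

Note that $\operatorname{proj}_{\operatorname{Span}\{u\}} : V \to V$ is a linear operator.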


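Example 3.2.3. In the inner product space $\mathbb{C}^3$ find the projection of the
vector $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ on $\operatorname{Span}\left\{ \begin{bmatrix} i \\ -1 \\ 1 \end{bmatrix} \right\}$.

Solution. Since

\[
\left\langle \begin{bmatrix} i \\ -1 \\ 1 \end{bmatrix}, \begin{bmatrix} i \\ -1 \\ 1 \end{bmatrix} \right\rangle = 3
\quad\text{and}\quad
\left\langle \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} i \\ -1 \\ 1 \end{bmatrix} \right\rangle = -i,
\]

the projection of the vector $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ on $\operatorname{Span}\left\{ \begin{bmatrix} i \\ -1 \\ 1 \end{bmatrix} \right\}$ is

\[
-\frac{i}{3} \begin{bmatrix} i \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} \\ \frac{1}{3}i \\ -\frac{1}{3}i \end{bmatrix}.
\]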

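Example 3.2.4. Consider the vector space of continuous functions defined on
the interval $[-\pi, \pi]$ with the inner product

\[
\langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\overline{g(t)}\,dt.
\]

Find the projection of the function $f(t) = t$ on $\operatorname{Span}\{\sin nt\}$ where n is an
arbitrary positive integer.

Solution. The projection is

\[
\frac{\langle t, \sin nt \rangle}{\langle \sin nt, \sin nt \rangle} \sin nt.
\]

Since

\[
\int t \sin nt\,dt = t\left(-\frac{\cos nt}{n}\right) + \frac{1}{n}\int \cos nt\,dt
= t\left(-\frac{\cos nt}{n}\right) + \frac{\sin nt}{n^2},
\]

we have

\[
\langle t, \sin nt \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} t \sin nt\,dt = (-1)^{n+1}\frac{2}{n}.
\]

And, since

\[
\langle \sin nt, \sin nt \rangle = \|\sin nt\|^2 = \frac{1}{\pi} \int_{-\pi}^{\pi} (\sin nt)^2\,dt
= \frac{1}{2\pi} \int_{-\pi}^{\pi} (1 - \cos 2nt)\,dt = 1,
\]

the projection of the function t on $\operatorname{Span}\{\sin nt\}$ is

\[
(-1)^{n+1}\frac{2}{n} \sin nt.
\]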
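
The coefficient $(-1)^{n+1}\frac{2}{n}$ can be confirmed by numerical integration. The sketch below approximates the inner product of this example with a midpoint rule for real-valued functions; the quadrature, the step count, and the names `inner` and `coefficient` are assumptions made for illustration and are not part of the text.

```python
import math

def inner(f, g, steps=20000):
    # Midpoint-rule approximation of (1/pi) * integral of f(t) g(t) over [-pi, pi]
    # for real-valued functions f and g.
    h = 2 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = -math.pi + (k + 0.5) * h
        total += f(t) * g(t)
    return total * h / math.pi

def coefficient(n):
    # Coefficient <t, sin nt> / <sin nt, sin nt> of the projection of t on Span{sin nt}.
    s = lambda t: math.sin(n * t)
    return inner(lambda t: t, s) / inner(s, s)

for n in (1, 2, 3):
    print(n, coefficient(n), (-1) ** (n + 1) * 2 / n)
```

The two printed columns agree to several decimal places; the small discrepancy is the quadrature error.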


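Example 3.2.5. In the vector space $\mathbb{C}^n$ with the standard inner product the
projection of the vector $\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$ on $\operatorname{Span}\left\{ \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \right\}$ is

\[
\frac{\left\langle \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}, \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \right\rangle}
{\left\langle \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}, \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \right\rangle}
\begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}
= \frac{1}{\|y\|^2} \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix}
\begin{bmatrix} \overline{y_1} & \cdots & \overline{y_n} \end{bmatrix}
\begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix},
\]

where $\|y\|^2 = |y_1|^2 + \dots + |y_n|^2$.
The matrix $\dfrac{1}{\|y\|^2} \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \begin{bmatrix} \overline{y_1} & \cdots & \overline{y_n} \end{bmatrix}$
is called the projection matrix on $\operatorname{Span}\left\{ \begin{bmatrix} y_1 \\ \vdots \\ y_n \end{bmatrix} \right\}$.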
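
A short computation makes the projection matrix concrete. The sketch below builds the matrix with entries $y_i \overline{y_j} / \|y\|^2$ for one vector of $\mathbb{C}^3$ and checks two expected properties; the vector and the helper names `projection_matrix`, `apply`, and `multiply` are illustrative assumptions.

```python
def projection_matrix(y):
    # Matrix with entries y_i * conjugate(y_j) / ||y||^2, for nonzero y.
    norm_sq = sum(abs(t) ** 2 for t in y)
    return [[yi * complex(yj).conjugate() / norm_sq for yj in y] for yi in y]

def apply(P, x):
    # Matrix-vector product P x.
    return [sum(p * t for p, t in zip(row, x)) for row in P]

def multiply(P, Q):
    # Matrix product P Q for square matrices of the same size.
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

y = [1j, -1, 1]
P = projection_matrix(y)
px = apply(P, [1, 1, 1])   # projection of (1, 1, 1) on Span{y}
PP = multiply(P, P)        # projecting twice changes nothing
print(px)
```

Applying P to $(1, 1, 1)$ gives the vector $(\frac{1}{3}, \frac{1}{3}i, -\frac{1}{3}i)$, and the product of P with itself equals P up to rounding, as one expects from a projection.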


The following theorem establishes properties of projections in general vector
spaces that should look familiar from our experience with projections on lines in
$\mathbb{R}^2$ and $\mathbb{R}^3$. Part (a) says that the projection is the unique vector minimizing
the distance. In part (b) we say that the magnitude of the projection cannot
exceed the magnitude of the original vector. And in part (c) we show that the
projection and the original vector are the same if and only if the original vector
is on the line.




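Theorem 3.2.6. Let V be an inner product space and let u be a nonzero
vector in V. For all $v \in V$ we have

(a) $\|v - \operatorname{proj}_{\operatorname{Span}\{u\}}(v)\| < \|v - \beta u\|$ for every $\beta \in \mathbb{K}$ such that
$\beta \ne \dfrac{\langle v, u \rangle}{\langle u, u \rangle}$;

(b) $\|\operatorname{proj}_{\operatorname{Span}\{u\}}(v)\| \le \|v\|$;

(c) $\operatorname{proj}_{\operatorname{Span}\{u\}}(v) = v$ if and only if $v \in \operatorname{Span}\{u\}$.

Proof. (a) From Theorem 3.2.1 we get

\[
\langle v - \operatorname{proj}_{\operatorname{Span}\{u\}}(v), \alpha u \rangle = \overline{\alpha} \langle v - \operatorname{proj}_{\operatorname{Span}\{u\}}(v), u \rangle = 0
\]

for any $\alpha \in \mathbb{K}$. Consequently, for any $\beta \in \mathbb{K}$ we have

\[
\langle v - \operatorname{proj}_{\operatorname{Span}\{u\}}(v), \operatorname{proj}_{\operatorname{Span}\{u\}}(v) - \beta u \rangle = 0, \tag{3.1}
\]

because $\operatorname{proj}_{\operatorname{Span}\{u\}}(v) - \beta u \in \operatorname{Span}\{u\}$. From (3.1) we get

\[
\|v - \operatorname{proj}_{\operatorname{Span}\{u\}}(v)\|^2 + \|\operatorname{proj}_{\operatorname{Span}\{u\}}(v) - \beta u\|^2 = \|v - \beta u\|^2. \tag{3.2}
\]

Hence

\[
\|v - \operatorname{proj}_{\operatorname{Span}\{u\}}(v)\| < \|v - \beta u\|
\]

whenever $\|\operatorname{proj}_{\operatorname{Span}\{u\}}(v) - \beta u\| \ne 0$, that is, for every $\beta \ne \dfrac{\langle v, u \rangle}{\langle u, u \rangle}$.
(b) If we let $\beta = 0$ in (3.2), then we get

\[
\|v - \operatorname{proj}_{\operatorname{Span}\{u\}}(v)\|^2 + \|\operatorname{proj}_{\operatorname{Span}\{u\}}(v)\|^2 = \|v\|^2.
\]

Consequently

\[
\|\operatorname{proj}_{\operatorname{Span}\{u\}}(v)\| \le \|v\|.
\]

(c) If $\operatorname{proj}_{\operatorname{Span}\{u\}}(v) = v$, then

\[
v = \operatorname{proj}_{\operatorname{Span}\{u\}}(v) \in \operatorname{Span}\{u\}.
\]

Now, if $v \in \operatorname{Span}\{u\}$, then $v = \alpha u$ for some $\alpha \in \mathbb{K}$ and thus

\[
\operatorname{proj}_{\operatorname{Span}\{u\}}(v) = \frac{\langle v, u \rangle}{\langle u, u \rangle} u = \frac{\langle \alpha u, u \rangle}{\langle u, u \rangle} u = \alpha u = v.
\]

□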


From the properties of projections proved in Theorem 3.2.6 we can obtain 
the results in Theorems 3.1.21 and 3.1.22 in a different way. 




Theorem 3.2.7. Let V be an inner product space. For all $x, y \in V$ we
have

(a) $|\langle x, y \rangle| \le \|x\| \|y\|$;

(b) $|\langle x, y \rangle| = \|x\| \|y\|$ if and only if x and y are linearly dependent.

Proof. (a) Since the inequality is trivial when $x = 0$, we can assume that $x \ne 0$.
By Theorem 3.2.6 we have $\|\operatorname{proj}_{\operatorname{Span}\{x\}}(y)\| \le \|y\|$, which means that

\[
\frac{|\langle x, y \rangle|}{\|x\|^2} \|x\| \le \|y\|.
\]

Hence $|\langle x, y \rangle| \le \|x\| \|y\|$.
(b) Again, without loss of generality, we can assume that $x \ne 0$. Then the
following statements are equivalent:

- x and y are linearly dependent,
- $y \in \operatorname{Span}\{x\}$,
- $\|\operatorname{proj}_{\operatorname{Span}\{x\}}(y)\| = \|y\|$ (by (c) in Theorem 3.2.6),
- $\dfrac{|\langle x, y \rangle|}{\|x\|^2} \|x\| = \|y\|$,
- $|\langle x, y \rangle| = \|x\| \|y\|$.


3.2.2 Orthogonal projections on arbitrary subspaces 


Now we generalize orthogonal projections to arbitrary subspaces. While some 
properties remain the same, some aspects of projections become more compli- 
cated when the dimension of the subspace is more than one. 

We use the property described in Theorem 3.2.1 to define an orthogonal 
projection on a subspace. 


Definition 3.2.8. Let U be a subspace of an inner product space V and
let $v \in V$. A vector $p \in U$ is called an orthogonal projection of the vector
v on the subspace U if

\[
\langle v - p, u \rangle = 0
\]

for every vector $u \in U$.


Note that it is not clear if the projection always exists. This question will 
be addressed later. 




Theorem 3.2.9. Let U be a subspace of an inner product space V and
let $v \in V$. If an orthogonal projection of v on the subspace U exists, then
it is unique.

Proof. Assume that both $p_1$ and $p_2$ are orthogonal projections of v on the
subspace U, that is, $\langle v - p_1, u \rangle = \langle v - p_2, u \rangle = 0$ for every $u \in U$. Since
$p_1, p_2 \in U$, we have

\[
0 = \langle v - p_1, p_2 \rangle = \langle v, p_2 \rangle - \langle p_1, p_2 \rangle
\]

and

\[
0 = \langle v - p_2, p_2 \rangle = \langle v, p_2 \rangle - \langle p_2, p_2 \rangle = \langle v, p_2 \rangle - \|p_2\|^2.
\]

Consequently, $\langle p_1, p_2 \rangle = \|p_2\|^2$. Similarly, we can show that $\langle p_2, p_1 \rangle = \|p_1\|^2$.
Hence

\[
\|p_1 - p_2\|^2 = \langle p_1 - p_2, p_1 - p_2 \rangle = \|p_1\|^2 - \langle p_1, p_2 \rangle - \langle p_2, p_1 \rangle + \|p_2\|^2 = 0,
\]

proving that $p_1 = p_2$. □

The unique orthogonal projection of a vector v on a subspace U is denoted
by $\operatorname{proj}_U(v)$.


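Example 3.2.10. Consider the vector space $\mathcal{V}$ of continuous functions defined
on the interval $[-\pi, \pi]$ with the inner product

\[ \langle f, g\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\overline{g(t)}\,dt \]

and its subspace $\mathcal{U} = \operatorname{Span}\{1, \cos t, \sin t\}$. Show that the function $2\sin t$ is a
projection of the function $t$ on the subspace $\mathcal{U}$.

Solution. Since

\[ \langle t - 2\sin t, 1\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} (t - 2\sin t)\,dt = 0, \]

\[ \langle t - 2\sin t, \cos t\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} (t - 2\sin t)\cos t\,dt = 0, \]

and

\[ \langle t - 2\sin t, \sin t\rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} (t - 2\sin t)\sin t\,dt = 0, \]

we have $\operatorname{proj}_{\mathcal{U}}(t) = 2\sin t$. $\square$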
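The three integrals in Example 3.2.10 can also be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `inner` and `residual` and the midpoint rule with `n = 20000` subintervals are illustrative assumptions, and only real-valued functions are handled, so no complex conjugate appears.

```python
import math

def inner(f, g, a=-math.pi, b=math.pi, n=20000):
    """Approximate (1/pi) * integral over [a, b] of f(t) g(t) dt (midpoint rule).

    Assumption: f and g are real valued, so no conjugation is needed.
    """
    h = (b - a) / n
    total = sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n))
    return total * h / math.pi

def residual(t):
    """The difference t - 2 sin t between the function and its claimed projection."""
    return t - 2.0 * math.sin(t)

# Spanning functions of the subspace U = Span{1, cos t, sin t}.
spanning = [lambda t: 1.0, math.cos, math.sin]

# Each inner product should vanish up to quadrature error.
products = [inner(residual, u) for u in spanning]
print(products)
```

All three values come out at the level of the quadrature error, which is consistent with $t - 2\sin t$ being orthogonal to every function in the subspace.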




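Example 3.2.11. Consider the vector space $\mathcal{V}$ of continuous functions defined
on the interval $[-1, 1]$ with the inner product $\langle f, g\rangle = \int_{-1}^{1} f(t)\overline{g(t)}\,dt$. Find a
projection of a function $f \in \mathcal{V}$ on the subspace $\mathcal{E}$ of even functions.

Solution. First we note that, if $f \in \mathcal{V}$, then the function $\frac{1}{2}(f(t) + f(-t))$ is
even and, for any function $g \in \mathcal{E}$, the function $(f(t) - f(-t))\overline{g(t)}$ is odd. Since

\[ \int_{-1}^{1} \left( f(t) - \tfrac{1}{2}\bigl(f(t) + f(-t)\bigr) \right)\overline{g(t)}\,dt = \frac{1}{2}\int_{-1}^{1} \bigl(f(t) - f(-t)\bigr)\overline{g(t)}\,dt = 0 \]

for any function $g \in \mathcal{E}$, we conclude that $\operatorname{proj}_{\mathcal{E}}(f(t)) = \frac{1}{2}(f(t) + f(-t))$. $\square$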


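Definition 3.2.12. Let $\mathcal{U}$ be a subspace of an inner product space $\mathcal{V}$ and
let $\mathbf{v}$ be a vector in $\mathcal{V}$. A vector $\mathbf{p} \in \mathcal{U}$ is called the best approximation
to the vector $\mathbf{v}$ by vectors from the subspace $\mathcal{U}$ if

\[ \|\mathbf{v} - \mathbf{p}\| < \|\mathbf{v} - \mathbf{u}\| \]

for every $\mathbf{u} \in \mathcal{U}$ such that $\mathbf{u} \neq \mathbf{p}$.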


Note that the definition of the best approximation implies that, if a vector 
has a best approximation by vectors from the subspace, then that approximation 
is unique. 

A projection of a vector $\mathbf{v} \in \mathcal{V}$ on a subspace $\mathcal{U}$ can be interpreted as the
best approximation of $\mathbf{v}$ by vectors from $\mathcal{U}$. This point of view is natural in
many applications.


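Theorem 3.2.13. Let $\mathcal{U}$ be a subspace of an inner product space $\mathcal{V}$ and
let $\mathbf{v}$ be a vector in $\mathcal{V}$. The following conditions are equivalent:

(a) $\mathbf{p} \in \mathcal{U}$ is the orthogonal projection of $\mathbf{v}$ on $\mathcal{U}$;

(b) $\mathbf{p} \in \mathcal{U}$ is the best approximation to the vector $\mathbf{v}$ by vectors from $\mathcal{U}$.

In other words, $\langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle = 0$ for every $\mathbf{u} \in \mathcal{U}$ if and only if $\|\mathbf{v} - \mathbf{p}\| < \|\mathbf{v} - \mathbf{u}\|$
for every $\mathbf{u} \in \mathcal{U}$ such that $\mathbf{u} \neq \mathbf{p}$.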


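Proof. If $\mathbf{p}$ is an orthogonal projection of $\mathbf{v}$ on $\mathcal{U}$, then $\langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle = 0$ for every
vector $\mathbf{u} \in \mathcal{U}$. Let $\mathbf{q}$ be an arbitrary vector from $\mathcal{U}$. Then

\[ \|\mathbf{v} - \mathbf{q}\|^2 = \|\mathbf{v} - \mathbf{p} + \mathbf{p} - \mathbf{q}\|^2. \]

Since $\mathbf{p} - \mathbf{q} \in \mathcal{U}$, we have $\langle \mathbf{v} - \mathbf{p}, \mathbf{p} - \mathbf{q}\rangle = 0$ and thus

\[ \|\mathbf{v} - \mathbf{q}\|^2 = \|\mathbf{v} - \mathbf{p}\|^2 + \|\mathbf{p} - \mathbf{q}\|^2, \]

by the Pythagorean Theorem 3.1.35. Hence the vector $\mathbf{q} = \mathbf{p}$ is the best approximation
to the vector $\mathbf{v}$ by vectors from $\mathcal{U}$. This shows that (a) implies
(b).

Now assume that the vector $\mathbf{p}$ is the best approximation to the vector $\mathbf{v}$
by vectors from $\mathcal{U}$. Let $\mathbf{u}$ be an arbitrary nonzero vector in $\mathcal{U}$ and let $z$ be a
number from $\mathbb{K}$. Then

\[ \|\mathbf{v} - \mathbf{p} + z\mathbf{u}\|^2 \geq \|\mathbf{v} - \mathbf{p}\|^2 \]

and consequently

\[ |z|^2\|\mathbf{u}\|^2 + z\langle \mathbf{u}, \mathbf{v} - \mathbf{p}\rangle + \overline{z}\langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle \geq 0. \]

If we take $z = t\langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle$, where $t$ is a real number, and suppose that $\langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle \neq 0$,
then we get

\[ t^2\|\mathbf{u}\|^2 + 2t \geq 0 \]

for every real number $t$, which is not possible because $\|\mathbf{u}\| \neq 0$, and thus
$\langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle = 0$. Since $\mathbf{u}$ is an arbitrary nonzero vector in $\mathcal{U}$, $\mathbf{p}$ is an orthogonal
projection of $\mathbf{v}$ on $\mathcal{U}$. Therefore (b) implies (a). $\square$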


The next theorem can be helpful when calculating the projection of a vector
on the subspace spanned by vectors $\mathbf{u}_1, \dots, \mathbf{u}_k$.


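Theorem 3.2.14. Let $\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_k$ be vectors in an inner product space
$\mathcal{V}$ and let $\mathbf{v} \in \mathcal{V}$. The following conditions are equivalent:

(a) $\operatorname{proj}_{\operatorname{Span}\{\mathbf{u}_1,\dots,\mathbf{u}_k\}}(\mathbf{v}) = x_1\mathbf{u}_1 + x_2\mathbf{u}_2 + \cdots + x_k\mathbf{u}_k$;

(b)
\[
\begin{aligned}
x_1\langle \mathbf{u}_1, \mathbf{u}_1\rangle + x_2\langle \mathbf{u}_2, \mathbf{u}_1\rangle + \cdots + x_k\langle \mathbf{u}_k, \mathbf{u}_1\rangle &= \langle \mathbf{v}, \mathbf{u}_1\rangle \\
x_1\langle \mathbf{u}_1, \mathbf{u}_2\rangle + x_2\langle \mathbf{u}_2, \mathbf{u}_2\rangle + \cdots + x_k\langle \mathbf{u}_k, \mathbf{u}_2\rangle &= \langle \mathbf{v}, \mathbf{u}_2\rangle \\
&\;\;\vdots \\
x_1\langle \mathbf{u}_1, \mathbf{u}_k\rangle + x_2\langle \mathbf{u}_2, \mathbf{u}_k\rangle + \cdots + x_k\langle \mathbf{u}_k, \mathbf{u}_k\rangle &= \langle \mathbf{v}, \mathbf{u}_k\rangle
\end{aligned}
\]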


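Proof. Let $\mathbf{p} = x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k$. Then $\operatorname{proj}_{\operatorname{Span}\{\mathbf{u}_1,\dots,\mathbf{u}_k\}}(\mathbf{v}) = \mathbf{p}$ if and only if $\langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle = 0$ for
every $\mathbf{u} \in \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$. Since every $\mathbf{u} \in \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is of the form
$\mathbf{u} = y_1\mathbf{u}_1 + \cdots + y_k\mathbf{u}_k$, for some $y_1, \dots, y_k \in \mathbb{K}$, the equation $\langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle = 0$ is
equivalent to the equations

\[ \langle \mathbf{v} - \mathbf{p}, \mathbf{u}_1\rangle = 0,\ \langle \mathbf{v} - \mathbf{p}, \mathbf{u}_2\rangle = 0,\ \dots,\ \langle \mathbf{v} - \mathbf{p}, \mathbf{u}_k\rangle = 0 \]

or

\[
\begin{aligned}
\langle \mathbf{v} - (x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k), \mathbf{u}_1\rangle &= 0, \\
\langle \mathbf{v} - (x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k), \mathbf{u}_2\rangle &= 0, \\
&\;\;\vdots \\
\langle \mathbf{v} - (x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k), \mathbf{u}_k\rangle &= 0,
\end{aligned}
\]

which can also be written as

\[
\begin{aligned}
x_1\langle \mathbf{u}_1, \mathbf{u}_1\rangle + \cdots + x_k\langle \mathbf{u}_k, \mathbf{u}_1\rangle &= \langle \mathbf{v}, \mathbf{u}_1\rangle, \\
x_1\langle \mathbf{u}_1, \mathbf{u}_2\rangle + \cdots + x_k\langle \mathbf{u}_k, \mathbf{u}_2\rangle &= \langle \mathbf{v}, \mathbf{u}_2\rangle, \\
&\;\;\vdots \\
x_1\langle \mathbf{u}_1, \mathbf{u}_k\rangle + \cdots + x_k\langle \mathbf{u}_k, \mathbf{u}_k\rangle &= \langle \mathbf{v}, \mathbf{u}_k\rangle.
\end{aligned}
\]
$\square$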


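Example 3.2.15. Find the projection of the vector $\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$ on $\operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} i \\ 0 \\ 0 \end{pmatrix} \right\}$
in the inner product space $\mathbb{C}^3$.

Solution. We have to solve the system

\[
\begin{aligned}
x_1\langle \mathbf{u}_1, \mathbf{u}_1\rangle + x_2\langle \mathbf{u}_2, \mathbf{u}_1\rangle &= \langle \mathbf{v}, \mathbf{u}_1\rangle \\
x_1\langle \mathbf{u}_1, \mathbf{u}_2\rangle + x_2\langle \mathbf{u}_2, \mathbf{u}_2\rangle &= \langle \mathbf{v}, \mathbf{u}_2\rangle,
\end{aligned}
\]

where $\mathbf{u}_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $\mathbf{u}_2 = \begin{pmatrix} i \\ 0 \\ 0 \end{pmatrix}$, and $\mathbf{v} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, that is, the system

\[
\begin{aligned}
3x_1 + i x_2 &= 1 \\
-i x_1 + x_2 &= 0.
\end{aligned}
\]

Since the solutions are $x_1 = \frac{1}{2}$ and $x_2 = \frac{i}{2}$, the projection is

\[ \frac{1}{2}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \frac{i}{2}\begin{pmatrix} i \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ \frac{1}{2} \\ \frac{1}{2} \end{pmatrix}. \]

It is easy to verify that $\left\langle \mathbf{v} - \begin{pmatrix} 0 \\ \frac{1}{2} \\ \frac{1}{2} \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right\rangle = 0$ and $\left\langle \mathbf{v} - \begin{pmatrix} 0 \\ \frac{1}{2} \\ \frac{1}{2} \end{pmatrix}, \begin{pmatrix} i \\ 0 \\ 0 \end{pmatrix} \right\rangle = 0$. $\square$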
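The computation in Example 3.2.15 can be reproduced with a few lines of code. This is a minimal sketch under stated assumptions: the inner product on $\mathbb{C}^3$ is taken to be $\langle \mathbf{x}, \mathbf{y}\rangle = \sum_j x_j\overline{y_j}$, the helper name `inner` is illustrative, and the $2 \times 2$ system of Theorem 3.2.14 is solved by Cramer's rule purely for brevity.

```python
def inner(x, y):
    """Standard inner product on C^n: sum of x_j * conj(y_j) (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

u1 = [1 + 0j, 1 + 0j, 1 + 0j]
u2 = [1j, 0j, 0j]
v = [0j, 1 + 0j, 0j]

# System from Theorem 3.2.14:
#   x1 <u1,u1> + x2 <u2,u1> = <v,u1>
#   x1 <u1,u2> + x2 <u2,u2> = <v,u2>
a11, a12, b1 = inner(u1, u1), inner(u2, u1), inner(v, u1)
a21, a22, b2 = inner(u1, u2), inner(u2, u2), inner(v, u2)

# Cramer's rule for the 2x2 system.
det = a11 * a22 - a12 * a21
x1 = (b1 * a22 - a12 * b2) / det
x2 = (a11 * b2 - a21 * b1) / det

# Projection p = x1 u1 + x2 u2 and the residual v - p.
p = [x1 * s + x2 * t for s, t in zip(u1, u2)]
r = [a - b for a, b in zip(v, p)]
print(x1, x2, p)
```

The residual `r` is orthogonal to both `u1` and `u2`, which is exactly the defining property of the orthogonal projection.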


We are now in a position to prove that the orthogonal projection on a finite
dimensional subspace of an inner product space always exists. We do so by
showing that the system of equations in Theorem 3.2.14 always has a unique
solution.


Theorem 3.2.16. Let $\mathcal{U}$ be a finite dimensional subspace of an inner
product space $\mathcal{V}$. For any vector $\mathbf{v} \in \mathcal{V}$ the orthogonal projection of $\mathbf{v}$ on
the subspace $\mathcal{U}$ exists.


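Proof. Let $\mathcal{U} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$. Without loss of generality we can suppose
that the vectors $\mathbf{u}_1, \dots, \mathbf{u}_k$ are linearly independent. In view of Theorem 3.2.14
it suffices to show that the matrix

\[ \begin{pmatrix} \langle \mathbf{u}_1, \mathbf{u}_1\rangle & \dots & \langle \mathbf{u}_k, \mathbf{u}_1\rangle \\ \vdots & & \vdots \\ \langle \mathbf{u}_1, \mathbf{u}_k\rangle & \dots & \langle \mathbf{u}_k, \mathbf{u}_k\rangle \end{pmatrix} \]

is invertible. If

\[ \begin{pmatrix} \langle \mathbf{u}_1, \mathbf{u}_1\rangle & \dots & \langle \mathbf{u}_k, \mathbf{u}_1\rangle \\ \vdots & & \vdots \\ \langle \mathbf{u}_1, \mathbf{u}_k\rangle & \dots & \langle \mathbf{u}_k, \mathbf{u}_k\rangle \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_k \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix} \]

for some $x_1, \dots, x_k \in \mathbb{K}$, then

\[ \langle x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k, \mathbf{u}_j\rangle = 0 \]

for $1 \leq j \leq k$. Hence

\[ \langle x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k, x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k\rangle = \|x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k\|^2 = 0 \]

and thus

\[ x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k = \mathbf{0}. \]

Since the vectors $\mathbf{u}_1, \dots, \mathbf{u}_k$ are linearly independent, we get $x_1 = \cdots = x_k = 0$. $\square$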


The assumption that the subspace $\mathcal{U}$ in the above theorem is finite dimensional
is essential. For infinite dimensional subspaces the projection may not
exist. For example, consider the space $\mathcal{V}$ of all continuous functions on the interval
$[0, 1]$ with the inner product $\langle f, g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt$ and the subspace $\mathcal{U}$ of
all polynomials. Since there is no polynomial $p$ such that

\[ \int_0^1 \bigl(e^t - p(t)\bigr)\overline{q(t)}\,dt = 0 \]

for every polynomial $q$, the function $e^t$ does not have an orthogonal projection
on the subspace of all polynomials.


3.2.3 Calculations and applications of orthogonal projec- 
tions 


Theorem 3.2.14 gives us a method for effectively calculating projections on subspaces
spanned by arbitrary vectors $\mathbf{u}_1, \dots, \mathbf{u}_k$. It turns out that the calculations
are significantly simplified if the vectors $\mathbf{u}_1, \dots, \mathbf{u}_k$ are orthogonal.


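Definition 3.2.17. Let $\mathbf{v}_1, \dots, \mathbf{v}_k$ be vectors in an inner product space
$\mathcal{V}$. We say that the set $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is an orthogonal set if $\langle \mathbf{v}_i, \mathbf{v}_j\rangle = 0$
for all $i, j = 1, \dots, k$ such that $i \neq j$.

An orthogonal set $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is called an orthonormal set if $\|\mathbf{v}_i\| = 1$
for all $i = 1, \dots, k$.

The condition of orthonormality is often expressed in terms of the Kronecker
delta function:

\[ \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \]

Using the Kronecker delta function we can say that a set $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is orthonormal
if $\langle \mathbf{v}_i, \mathbf{v}_j\rangle = \delta_{ij}$ for all $i, j \in \{1, \dots, k\}$.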


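Theorem 3.2.18. Let $\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ be an orthogonal set of nonzero vectors
in an inner product space $\mathcal{V}$ and let $\mathcal{U} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$. Then

\[ \operatorname{proj}_{\mathcal{U}}\mathbf{v} = \frac{\langle \mathbf{v}, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}\mathbf{u}_1 + \cdots + \frac{\langle \mathbf{v}, \mathbf{u}_k\rangle}{\langle \mathbf{u}_k, \mathbf{u}_k\rangle}\mathbf{u}_k \]

for every vector $\mathbf{v}$ in $\mathcal{V}$.

Proof. Let

\[ \mathbf{p} = \frac{\langle \mathbf{v}, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}\mathbf{u}_1 + \cdots + \frac{\langle \mathbf{v}, \mathbf{u}_k\rangle}{\langle \mathbf{u}_k, \mathbf{u}_k\rangle}\mathbf{u}_k. \]

Then, for every $j \in \{1, \dots, k\}$, we have

\[
\begin{aligned}
\langle \mathbf{v} - \mathbf{p}, \mathbf{u}_j\rangle &= \left\langle \mathbf{v} - \frac{\langle \mathbf{v}, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}\mathbf{u}_1 - \cdots - \frac{\langle \mathbf{v}, \mathbf{u}_k\rangle}{\langle \mathbf{u}_k, \mathbf{u}_k\rangle}\mathbf{u}_k, \mathbf{u}_j \right\rangle \\
&= \langle \mathbf{v}, \mathbf{u}_j\rangle - \frac{\langle \mathbf{v}, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}\langle \mathbf{u}_1, \mathbf{u}_j\rangle - \cdots - \frac{\langle \mathbf{v}, \mathbf{u}_k\rangle}{\langle \mathbf{u}_k, \mathbf{u}_k\rangle}\langle \mathbf{u}_k, \mathbf{u}_j\rangle \\
&= \langle \mathbf{v}, \mathbf{u}_j\rangle - \frac{\langle \mathbf{v}, \mathbf{u}_j\rangle}{\langle \mathbf{u}_j, \mathbf{u}_j\rangle}\langle \mathbf{u}_j, \mathbf{u}_j\rangle = 0.
\end{aligned}
\]

Since every $\mathbf{u} \in \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is of the form

\[ \mathbf{u} = x_1\mathbf{u}_1 + \cdots + x_k\mathbf{u}_k, \]

for some $x_1, x_2, \dots, x_k \in \mathbb{K}$, it follows that

\[ \langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle = 0 \]

for every $\mathbf{u} \in \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$, which means that $\mathbf{p} = \operatorname{proj}_{\mathcal{U}}\mathbf{v}$. $\square$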
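The formula of Theorem 3.2.18 translates directly into code. The sketch below is illustrative only: the function names `inner` and `project` and the sample vectors in $\mathbb{C}^3$ are assumptions made for the example, and the inner product is taken to be $\langle \mathbf{x}, \mathbf{y}\rangle = \sum_j x_j\overline{y_j}$.

```python
def inner(x, y):
    """Standard inner product on C^n: sum of x_j * conj(y_j) (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def project(v, orthogonal_set):
    """Sum of (<v,u>/<u,u>) u over an orthogonal set of nonzero vectors."""
    p = [0j] * len(v)
    for u in orthogonal_set:
        c = inner(v, u) / inner(u, u)
        p = [pk + c * uk for pk, uk in zip(p, u)]
    return p

# Hypothetical data: u1 and u2 are orthogonal and nonzero.
u1 = [1 + 0j, 1 + 0j, 0j]
u2 = [1 + 0j, -1 + 0j, 0j]
v = [2 + 1j, 3 + 0j, 5 - 1j]

p = project(v, [u1, u2])
r = [a - b for a, b in zip(v, p)]   # residual v - p
print(p, r)
```

Here the span of `u1` and `u2` consists of the vectors with third coordinate zero, so the projection simply drops the third coordinate of `v`. Note that `project` gives the orthogonal projection only when the input set really is orthogonal; for a general spanning set the system of Theorem 3.2.14 has to be solved instead.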


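Note that Theorem 3.2.18 implies that, if $\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is an orthogonal set
of nonzero vectors, then the best approximation to the vector $\mathbf{v}$ by vectors from
the subspace $\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is the vector

\[ \frac{\langle \mathbf{v}, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}\mathbf{u}_1 + \cdots + \frac{\langle \mathbf{v}, \mathbf{u}_k\rangle}{\langle \mathbf{u}_k, \mathbf{u}_k\rangle}\mathbf{u}_k. \]

If $\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is an orthonormal set, then the formula for the projection
becomes even simpler.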


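Corollary 3.2.19. Let $\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ be an orthonormal set in an inner
product space $\mathcal{V}$ and let $\mathcal{U} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$. Then

\[ \operatorname{proj}_{\mathcal{U}}\mathbf{v} = \langle \mathbf{v}, \mathbf{u}_1\rangle\mathbf{u}_1 + \cdots + \langle \mathbf{v}, \mathbf{u}_k\rangle\mathbf{u}_k \]

for every vector $\mathbf{v}$ in $\mathcal{V}$.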


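Example 3.2.20. Consider the vector space of continuous functions defined
on the interval $[\alpha, \beta]$ with the inner product

\[ \langle f, g\rangle = \frac{2}{\beta - \alpha}\int_{\alpha}^{\beta} f(t)\overline{g(t)}\,dt. \]

Show that the set $\left\{ \cos\frac{2\pi t}{\beta - \alpha}, \sin\frac{10\pi t}{\beta - \alpha} \right\}$ is an orthonormal set and thus for every
function $f$ continuous on the interval $[\alpha, \beta]$ we have

\[ \operatorname{proj}_{\mathcal{U}}(f) = \left( \frac{2}{\beta - \alpha}\int_{\alpha}^{\beta} f(t)\cos\frac{2\pi t}{\beta - \alpha}\,dt \right)\cos\frac{2\pi t}{\beta - \alpha} + \left( \frac{2}{\beta - \alpha}\int_{\alpha}^{\beta} f(t)\sin\frac{10\pi t}{\beta - \alpha}\,dt \right)\sin\frac{10\pi t}{\beta - \alpha}, \]

where $\mathcal{U} = \operatorname{Span}\left\{ \cos\frac{2\pi t}{\beta - \alpha}, \sin\frac{10\pi t}{\beta - \alpha} \right\}$.

Solution. First we recall that for an arbitrary positive integer $k$ we have

\[ \int_{\alpha}^{\beta} \sin\frac{2k\pi t}{\beta - \alpha}\,dt = \frac{\alpha - \beta}{2k\pi}\left( \cos\frac{2k\pi\beta}{\beta - \alpha} - \cos\frac{2k\pi\alpha}{\beta - \alpha} \right) = \frac{\alpha - \beta}{2k\pi}\left( \cos\frac{2k\pi(\beta - \alpha + \alpha)}{\beta - \alpha} - \cos\frac{2k\pi\alpha}{\beta - \alpha} \right) = 0 \]

and

\[ \int_{\alpha}^{\beta} \cos\frac{2k\pi t}{\beta - \alpha}\,dt = \frac{\beta - \alpha}{2k\pi}\left( \sin\frac{2k\pi(\beta - \alpha + \alpha)}{\beta - \alpha} - \sin\frac{2k\pi\alpha}{\beta - \alpha} \right) = 0. \]

Now we use the trigonometric identity

\[ \sin x\cos y = \frac{1}{2}\bigl(\sin(x + y) + \sin(x - y)\bigr) \]

and the above result to calculate the inner product:

\[ \left\langle \cos\frac{2\pi t}{\beta - \alpha}, \sin\frac{10\pi t}{\beta - \alpha} \right\rangle = \frac{2}{\beta - \alpha}\int_{\alpha}^{\beta} \cos\frac{2\pi t}{\beta - \alpha}\sin\frac{10\pi t}{\beta - \alpha}\,dt = \frac{1}{\beta - \alpha}\int_{\alpha}^{\beta} \left( \sin\frac{12\pi t}{\beta - \alpha} + \sin\frac{8\pi t}{\beta - \alpha} \right)dt = 0. \]

This proves orthogonality of the set. Using a similar approach we can show
that

\[ \left\| \cos\frac{2\pi t}{\beta - \alpha} \right\| = \left\| \sin\frac{10\pi t}{\beta - \alpha} \right\| = 1, \]

which completes the proof of orthonormality. Then the formula for the
projection $\operatorname{proj}_{\mathcal{U}}(f)$ follows from Corollary 3.2.19. $\square$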
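The orthonormality claimed in Example 3.2.20 can be sanity-checked numerically for one particular interval. The following is a rough sketch, not a proof: the endpoints `ALPHA = 1.0` and `BETA = 4.0`, the helper names, and the midpoint rule are all hypothetical choices made for illustration, and only real-valued functions are handled.

```python
import math

ALPHA, BETA = 1.0, 4.0          # hypothetical endpoints of the interval
LENGTH = BETA - ALPHA

def inner(f, g, n=20000):
    """Approximate (2/(beta-alpha)) * integral of f(t) g(t) dt (midpoint rule)."""
    h = LENGTH / n
    total = sum(f(ALPHA + (k + 0.5) * h) * g(ALPHA + (k + 0.5) * h) for k in range(n))
    return 2.0 / LENGTH * total * h

def cos_u(t):
    return math.cos(2.0 * math.pi * t / LENGTH)

def sin_u(t):
    return math.sin(10.0 * math.pi * t / LENGTH)

# Gram matrix of the two functions; it should be close to the identity matrix.
gram = [[inner(f, g) for g in (cos_u, sin_u)] for f in (cos_u, sin_u)]
print(gram)
```

A Gram matrix close to the $2 \times 2$ identity is what the Kronecker delta condition $\langle \mathbf{v}_i, \mathbf{v}_j\rangle = \delta_{ij}$ predicts.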


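Example 3.2.21. Let $\mathcal{P}_2([-1, 1])$ be the space of complex valued polynomials
on the interval $[-1, 1]$ of degree at most 2 with the inner product defined as

\[ \langle p, q\rangle = \int_{-1}^{1} p(t)\overline{q(t)}\,dt. \]

Show that $\left\{ \frac{1}{\sqrt{2}}, \sqrt{\frac{3}{2}}\,t, \sqrt{\frac{5}{8}}\,(3t^2 - 1) \right\}$ is an orthonormal set in $\mathcal{P}_2([-1, 1])$.

Solution. We need to calculate the following integrals:

\[ \int_{-1}^{1} \frac{1}{\sqrt{2}}\sqrt{\frac{3}{2}}\,t\,dt = \frac{\sqrt{3}}{2}\int_{-1}^{1} t\,dt = 0, \]

\[ \int_{-1}^{1} \frac{1}{\sqrt{2}}\sqrt{\frac{5}{8}}\,(3t^2 - 1)\,dt = \frac{\sqrt{5}}{4}\int_{-1}^{1} (3t^2 - 1)\,dt = 0, \]

\[ \int_{-1}^{1} \sqrt{\frac{3}{2}}\,t\,\sqrt{\frac{5}{8}}\,(3t^2 - 1)\,dt = \frac{\sqrt{15}}{4}\int_{-1}^{1} (3t^3 - t)\,dt = 0, \]

\[ \int_{-1}^{1} \left( \frac{1}{\sqrt{2}} \right)^2 dt = \frac{1}{2}\int_{-1}^{1} dt = 1, \]

\[ \int_{-1}^{1} \left( \sqrt{\frac{3}{2}}\,t \right)^2 dt = \frac{3}{2}\int_{-1}^{1} t^2\,dt = 1, \]

\[ \int_{-1}^{1} \left( \sqrt{\frac{5}{8}}\,(3t^2 - 1) \right)^2 dt = \frac{5}{8}\int_{-1}^{1} (9t^4 - 6t^2 + 1)\,dt = 1. \]
$\square$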
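The integrals of Example 3.2.21 can be double-checked mechanically. This is a minimal sketch under assumptions: the polynomials are taken to be $\frac{1}{\sqrt{2}}$, $\sqrt{3/2}\,t$, $\sqrt{5/8}\,(3t^2 - 1)$ with the inner product $\int_{-1}^{1} p(t)q(t)\,dt$ (real coefficients only), polynomials are stored as coefficient lists with the constant term first, and the name `integrate_product` is illustrative.

```python
import math

def integrate_product(p, q):
    """Integral over [-1, 1] of p(t) q(t) for real coefficient lists (constant term first)."""
    total = 0.0
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:               # odd powers integrate to zero on [-1, 1]
                total += a * b * 2.0 / (i + j + 1)
    return total

c = math.sqrt(5.0 / 8.0)
family = [
    [1.0 / math.sqrt(2.0)],          # 1/sqrt(2)
    [0.0, math.sqrt(1.5)],           # sqrt(3/2) t
    [-c, 0.0, 3.0 * c],              # sqrt(5/8) (3 t^2 - 1)
]

# Gram matrix; an orthonormal set gives the identity matrix.
gram = [[integrate_product(p, q) for q in family] for p in family]
print(gram)
```

The off-diagonal entries vanish and the diagonal entries equal one up to rounding, matching the six integrals in the solution.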


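Example 3.2.22. Let $\mathcal{V}$ be an inner product space and let $\{\mathbf{v}_1, \mathbf{v}_2\}$ be an
orthonormal set in $\mathcal{V}$. Show that the set $\left\{ \frac{1}{\sqrt{2}}(\mathbf{v}_2 + \mathbf{v}_1), \frac{1}{\sqrt{2}}(\mathbf{v}_2 - \mathbf{v}_1) \right\}$ is
orthonormal.

Solution. We have

\[ \left\langle \tfrac{1}{\sqrt{2}}(\mathbf{v}_2 + \mathbf{v}_1), \tfrac{1}{\sqrt{2}}(\mathbf{v}_2 - \mathbf{v}_1) \right\rangle = \frac{1}{2}\bigl( \langle \mathbf{v}_2, \mathbf{v}_2\rangle - \langle \mathbf{v}_2, \mathbf{v}_1\rangle + \langle \mathbf{v}_1, \mathbf{v}_2\rangle - \langle \mathbf{v}_1, \mathbf{v}_1\rangle \bigr) = 0, \]

\[ \left\| \tfrac{1}{\sqrt{2}}(\mathbf{v}_2 + \mathbf{v}_1) \right\|^2 = \left\langle \tfrac{1}{\sqrt{2}}(\mathbf{v}_2 + \mathbf{v}_1), \tfrac{1}{\sqrt{2}}(\mathbf{v}_2 + \mathbf{v}_1) \right\rangle = \frac{1}{2}\bigl( \langle \mathbf{v}_2, \mathbf{v}_2\rangle + \langle \mathbf{v}_1, \mathbf{v}_1\rangle \bigr) = 1, \]

and similarly

\[ \left\| \tfrac{1}{\sqrt{2}}(\mathbf{v}_2 - \mathbf{v}_1) \right\|^2 = 1. \]
$\square$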


For many results on subspaces $\mathcal{U} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ it was necessary to
assume that the vectors $\mathbf{u}_1, \dots, \mathbf{u}_k$ were linearly independent. It turns out
that, if $\mathbf{u}_1, \dots, \mathbf{u}_k$ are nonzero orthogonal vectors, then they are always linearly
independent. Consequently, any orthonormal set is linearly independent.


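Theorem 3.2.23. If $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is an orthogonal set of nonzero vectors
in an inner product space $\mathcal{V}$, then the vectors $\mathbf{v}_1, \dots, \mathbf{v}_k$ are linearly
independent.

Proof. If $\{\mathbf{v}_1, \dots, \mathbf{v}_k\}$ is an orthogonal set of nonzero vectors and

\[ x_1\mathbf{v}_1 + \cdots + x_k\mathbf{v}_k = \mathbf{0} \]

for some numbers $x_1, \dots, x_k \in \mathbb{K}$, then for any $j \in \{1, \dots, k\}$ we have

\[ \langle x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + \cdots + x_k\mathbf{v}_k, \mathbf{v}_j\rangle = x_1\langle \mathbf{v}_1, \mathbf{v}_j\rangle + x_2\langle \mathbf{v}_2, \mathbf{v}_j\rangle + \cdots + x_k\langle \mathbf{v}_k, \mathbf{v}_j\rangle = x_j\langle \mathbf{v}_j, \mathbf{v}_j\rangle = x_j\|\mathbf{v}_j\|^2. \]

On the other hand,

\[ \langle x_1\mathbf{v}_1 + x_2\mathbf{v}_2 + \cdots + x_k\mathbf{v}_k, \mathbf{v}_j\rangle = \langle \mathbf{0}, \mathbf{v}_j\rangle = 0. \]

Since $x_j\|\mathbf{v}_j\|^2 = 0$ and $\|\mathbf{v}_j\| \neq 0$, we must have $x_j = 0$. Consequently, the
vectors $\mathbf{v}_1, \dots, \mathbf{v}_k$ are linearly independent. $\square$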


3.2.4 The annihilator and the orthogonal complement 


In Chapter 1 we introduced the notion of complementary subspaces: If $\mathcal{U}$ is a
subspace of a vector space $\mathcal{V}$, then a subspace $\mathcal{W}$ is called a complement of $\mathcal{U}$ in
$\mathcal{V}$ if $\mathcal{V} = \mathcal{U} \oplus \mathcal{W}$. We pointed out that such a space is not unique. In inner product
spaces we can define orthogonal complements that have better properties.


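Definition 3.2.24. Let $A$ be a nonempty subset of an inner product
space $\mathcal{V}$. The set of all vectors in $\mathcal{V}$ orthogonal to every vector in $A$ is
called the annihilator of $A$ and is denoted by $A^{\perp}$:

\[ A^{\perp} = \{\mathbf{x} \in \mathcal{V} : \langle \mathbf{x}, \mathbf{v}\rangle = 0 \text{ for every } \mathbf{v} \in A\}. \]

If $\mathcal{U}$ is a subspace of $\mathcal{V}$, then $\mathcal{U}^{\perp}$ is called the orthogonal complement of
$\mathcal{U}$.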


From the definition of the annihilator and basic properties of the inner prod- 
uct we get the following useful result. 




Theorem 3.2.25. Let $A$ be a subset of an inner product space $\mathcal{V}$. The
annihilator $A^{\perp}$ is a subspace of $\mathcal{V}$.


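Example 3.2.26. Show that

\[ \langle f, g\rangle = f(0)\overline{g(0)} + f'(0)\overline{g'(0)} + f''(0)\overline{g''(0)} \]

is an inner product in the vector space $\mathcal{P}_2(\mathbb{C})$ and determine $(\operatorname{Span}\{t^2 + 1\})^{\perp}$.

Solution. First we note that $\langle f, g\rangle$ is an inner product because

\[ \langle \alpha_1 t^2 + \beta_1 t + \gamma_1, \alpha_2 t^2 + \beta_2 t + \gamma_2\rangle = \gamma_1\overline{\gamma_2} + \beta_1\overline{\beta_2} + 4\alpha_1\overline{\alpha_2}. \]

Now we describe $(\operatorname{Span}\{t^2 + 1\})^{\perp}$. Since

\[ \langle \alpha t^2 + \beta t + \gamma, t^2 + 1\rangle = 4\alpha + \gamma \]

we have

\[ (\operatorname{Span}\{t^2 + 1\})^{\perp} = \{\alpha t^2 + \beta t + \gamma : 4\alpha + \gamma = 0\} = \operatorname{Span}\{t, t^2 - 4\}. \]
$\square$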
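The description of $(\operatorname{Span}\{t^2 + 1\})^{\perp}$ in Example 3.2.26 is easy to confirm by direct evaluation. In this small sketch a polynomial $\alpha t^2 + \beta t + \gamma$ is stored as the triple `(alpha, beta, gamma)`; this representation and the function name `inner` are assumptions made for the illustration.

```python
def inner(p, q):
    """<f,g> = f(0) conj(g(0)) + f'(0) conj(g'(0)) + f''(0) conj(g''(0)).

    p and q are triples (alpha, beta, gamma) for alpha t^2 + beta t + gamma,
    so f(0) = gamma, f'(0) = beta and f''(0) = 2 alpha.
    """
    a1, b1, c1 = p
    a2, b2, c2 = q
    return (c1 * complex(c2).conjugate()
            + b1 * complex(b2).conjugate()
            + (2 * a1) * complex(2 * a2).conjugate())

target = (1, 0, 1)          # t^2 + 1
candidates = {
    "t": (0, 1, 0),
    "t^2 - 4": (1, 0, -4),
}

# Both spanning polynomials of the claimed complement are orthogonal to t^2 + 1.
results = {name: inner(poly, target) for name, poly in candidates.items()}
print(results)
```

Since $\mathcal{P}_2(\mathbb{C})$ has dimension 3 and $t^2 + 1$ is not orthogonal to itself, the complement is two dimensional, which is consistent with the two independent polynomials found above.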


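If $\mathcal{U}$ is a subspace of an inner product space $\mathcal{V}$, then $\mathcal{U}^{\perp}$ is a subspace of $\mathcal{V}$,
so it makes sense to consider the subspace $(\mathcal{U}^{\perp})^{\perp}$. How is this subspace related
to $\mathcal{U}$? If $\mathbf{u} \in \mathcal{U}$, then $\mathbf{u}$ is orthogonal to every vector in $\mathcal{U}^{\perp}$, so $\mathbf{u} \in (\mathcal{U}^{\perp})^{\perp}$.
This means that $\mathcal{U} \subseteq (\mathcal{U}^{\perp})^{\perp}$. In general, $\mathcal{U}$ and $(\mathcal{U}^{\perp})^{\perp}$ need not be equal.
For example, consider the space $\mathcal{V}$ of all continuous functions on the interval
$[0, 1]$ with the inner product $\langle f, g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt$ and the subspace $\mathcal{U}$ of all
polynomials. It can be shown that, for any continuous function $f$, if

\[ \int_0^1 f(t)\overline{q(t)}\,dt = 0 \]

for every polynomial $q$, then $f = 0$. This means that $\mathcal{U}^{\perp} = \{0\}$ and thus
$(\mathcal{U}^{\perp})^{\perp} = \mathcal{V} \neq \mathcal{U}$.
If we assume that $\mathcal{U}$ is finite dimensional, then we can show that $\mathcal{U}$ and
$(\mathcal{U}^{\perp})^{\perp}$ are equal.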


Theorem 3.2.27. If $\mathcal{U}$ is a finite dimensional subspace of an inner
product space $\mathcal{V}$, then

\[ (\mathcal{U}^{\perp})^{\perp} = \mathcal{U}. \]




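Proof. We need to show that $(\mathcal{U}^{\perp})^{\perp} \subseteq \mathcal{U}$. Let $\mathbf{v} \in (\mathcal{U}^{\perp})^{\perp}$. If $\mathcal{U}$ is finite
dimensional, then $\operatorname{proj}_{\mathcal{U}}(\mathbf{v})$ exists, by Theorem 3.2.16. Since $\langle \mathbf{w}, \mathbf{v}\rangle = 0$ for every
$\mathbf{w} \in \mathcal{U}^{\perp}$ and $\mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}) \in \mathcal{U}^{\perp}$, we have $\langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{v}\rangle = 0$. Consequently,

\[
\begin{aligned}
0 &= \langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{v}\rangle \\
&= \langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}) + \operatorname{proj}_{\mathcal{U}}(\mathbf{v})\rangle \\
&= \langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v})\rangle + \langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \operatorname{proj}_{\mathcal{U}}(\mathbf{v})\rangle \\
&= \langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v})\rangle \\
&= \|\mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v})\|^2,
\end{aligned}
\]

which means that $\mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}) = \mathbf{0}$. Thus $\mathbf{v} = \operatorname{proj}_{\mathcal{U}}(\mathbf{v})$, which implies $\mathbf{v} \in \mathcal{U}$. $\square$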


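Theorem 3.2.28. Let $\mathcal{V}$ be an inner product space and let $\mathcal{U}$ be a finite
dimensional subspace of $\mathcal{V}$. Then for every $\mathbf{v} \in \mathcal{V}$ the projection of $\mathbf{v}$ on
$\mathcal{U}^{\perp}$ exists and

\[ \operatorname{proj}_{\mathcal{U}^{\perp}}(\mathbf{v}) = \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}). \]

Proof. First we note that $\langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{u}\rangle = 0$ for every $\mathbf{u} \in \mathcal{U}$, which means
$\mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}) \in \mathcal{U}^{\perp}$. Moreover, for every $\mathbf{w} \in \mathcal{U}^{\perp}$ we have

\[ \langle \mathbf{v} - (\mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v})), \mathbf{w}\rangle = \langle \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{w}\rangle = 0. \]

Therefore $\mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v})$ is the projection of $\mathbf{v}$ on $\mathcal{U}^{\perp}$. $\square$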


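Theorem 3.2.29. For any finite dimensional subspace $\mathcal{U}$ of an inner
product space $\mathcal{V}$ we have

\[ \mathcal{V} = \mathcal{U} \oplus \mathcal{U}^{\perp}. \]

Proof. For every $\mathbf{v} \in \mathcal{V}$ we have

\[ \mathbf{v} = \operatorname{proj}_{\mathcal{U}}(\mathbf{v}) + \operatorname{proj}_{\mathcal{U}^{\perp}}(\mathbf{v}), \]

by Theorem 3.2.28. Hence $\mathcal{V} = \mathcal{U} + \mathcal{U}^{\perp}$. If $\mathbf{x} \in \mathcal{U} \cap \mathcal{U}^{\perp}$, then $\|\mathbf{x}\|^2 = \langle \mathbf{x}, \mathbf{x}\rangle = 0$
and thus $\mathbf{x} = \mathbf{0}$, which means that $\mathcal{V} = \mathcal{U} \oplus \mathcal{U}^{\perp}$. $\square$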


In the next theorem we list all basic results on orthogonal projections on
finite dimensional subspaces. We assume that the subspace $\mathcal{U}$ is finite dimensional
to ensure that $\operatorname{proj}_{\mathcal{U}}(\mathbf{v})$ exists for every $\mathbf{v} \in \mathcal{V}$. If we replace the
assumption that $\mathcal{U}$ is finite dimensional by the assumption that $\operatorname{proj}_{\mathcal{U}}(\mathbf{v})$
exists for every $\mathbf{v} \in \mathcal{V}$, the theorem remains true.




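Theorem 3.2.30. Let $\mathcal{U}$ be a finite dimensional subspace of an inner
product space $\mathcal{V}$. Then

(a) $\mathbf{p} = \operatorname{proj}_{\mathcal{U}}(\mathbf{v})$ if and only if $\langle \mathbf{v} - \mathbf{p}, \mathbf{u}\rangle = 0$ for every $\mathbf{u} \in \mathcal{U}$;

(b) $\operatorname{proj}_{\mathcal{U}}(\mathbf{v})$ is the best approximation to the vector $\mathbf{v}$ by vectors from
the subspace $\mathcal{U}$, that is, $\mathbf{p} = \operatorname{proj}_{\mathcal{U}}(\mathbf{v})$ if and only if $\|\mathbf{v} - \mathbf{p}\| < \|\mathbf{v} - \mathbf{u}\|$
for every $\mathbf{u} \in \mathcal{U}$ such that $\mathbf{u} \neq \mathbf{p}$;

(c) $\operatorname{proj}_{\mathcal{U}} : \mathcal{V} \to \mathcal{V}$ is a linear transformation;

(d) $\mathbf{u} = \operatorname{proj}_{\mathcal{U}}(\mathbf{u})$ for every $\mathbf{u} \in \mathcal{U}$;

(e) $\operatorname{ran} \operatorname{proj}_{\mathcal{U}} = \mathcal{U}$;

(f) $\ker \operatorname{proj}_{\mathcal{U}} = \mathcal{U}^{\perp}$;

(g) $\operatorname{proj}_{\mathcal{U}^{\perp}} = \operatorname{Id} - \operatorname{proj}_{\mathcal{U}}$;

(h) $\operatorname{proj}_{\mathcal{U}}(\operatorname{proj}_{\mathcal{U}}(\mathbf{v})) = \operatorname{proj}_{\mathcal{U}}(\mathbf{v})$ for every $\mathbf{v} \in \mathcal{V}$;

(i) $\langle \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{w}\rangle = \langle \mathbf{v}, \operatorname{proj}_{\mathcal{U}}(\mathbf{w})\rangle$ for every $\mathbf{v}, \mathbf{w} \in \mathcal{V}$.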


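Proof. (a) is the definition of orthogonal projections (Definition 3.2.8);

(b) is the statement in Theorem 3.2.13;

(c) If $\mathbf{v}, \mathbf{w} \in \mathcal{V}$ and $\alpha, \beta \in \mathbb{K}$, then

\[ \langle \alpha\mathbf{v} + \beta\mathbf{w} - (\alpha\operatorname{proj}_{\mathcal{U}}(\mathbf{v}) + \beta\operatorname{proj}_{\mathcal{U}}(\mathbf{w})), \mathbf{u}\rangle = \alpha\langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{u}\rangle + \beta\langle \mathbf{w} - \operatorname{proj}_{\mathcal{U}}(\mathbf{w}), \mathbf{u}\rangle = 0 \]

for every $\mathbf{u} \in \mathcal{U}$. Hence

\[ \operatorname{proj}_{\mathcal{U}}(\alpha\mathbf{v} + \beta\mathbf{w}) = \alpha\operatorname{proj}_{\mathcal{U}}(\mathbf{v}) + \beta\operatorname{proj}_{\mathcal{U}}(\mathbf{w}); \]

(d) For every $\mathbf{u}, \mathbf{u}' \in \mathcal{U}$ we have $\langle \mathbf{u} - \mathbf{u}, \mathbf{u}'\rangle = 0$, which means that $\mathbf{u} = \operatorname{proj}_{\mathcal{U}}(\mathbf{u})$;

(e) follows from (d) and the definition of the projection;

(f) If $\mathbf{v} \in \ker \operatorname{proj}_{\mathcal{U}}$, then for every $\mathbf{u} \in \mathcal{U}$ we have

\[ \langle \mathbf{v}, \mathbf{u}\rangle = \langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{u}\rangle = 0, \]

which means that $\mathbf{v} \in \mathcal{U}^{\perp}$. Now, if $\mathbf{v} \in \mathcal{U}^{\perp}$, then $\langle \mathbf{v} - \mathbf{0}, \mathbf{u}\rangle = \langle \mathbf{v}, \mathbf{u}\rangle = 0$ for
every $\mathbf{u} \in \mathcal{U}$, which means that $\operatorname{proj}_{\mathcal{U}}(\mathbf{v}) = \mathbf{0}$;

(g) is equivalent to the statement in Theorem 3.2.28;

(h) is a consequence of (d);

(i) For any $\mathbf{v}, \mathbf{w} \in \mathcal{V}$ we have

\[
\begin{aligned}
\langle \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{w}\rangle &= \langle \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \mathbf{w} - \operatorname{proj}_{\mathcal{U}}(\mathbf{w}) + \operatorname{proj}_{\mathcal{U}}(\mathbf{w})\rangle \\
&= \langle \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \operatorname{proj}_{\mathcal{U}}(\mathbf{w})\rangle \\
&= \langle \mathbf{v} - \operatorname{proj}_{\mathcal{U}}(\mathbf{v}) + \operatorname{proj}_{\mathcal{U}}(\mathbf{v}), \operatorname{proj}_{\mathcal{U}}(\mathbf{w})\rangle \\
&= \langle \mathbf{v}, \operatorname{proj}_{\mathcal{U}}(\mathbf{w})\rangle.
\end{aligned}
\]
$\square$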




It turns out that properties (c), (h), and (i) in Theorem 3.2.30 characterize 
orthogonal projections. 


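Theorem 3.2.31. Let $\mathcal{V}$ be an inner product space. If $f : \mathcal{V} \to \mathcal{V}$ is a
linear transformation such that

(a) $f(f(\mathbf{x})) = f(\mathbf{x})$ for every $\mathbf{x} \in \mathcal{V}$,

(b) $\langle f(\mathbf{x}), \mathbf{y}\rangle = \langle \mathbf{x}, f(\mathbf{y})\rangle$ for every $\mathbf{x}, \mathbf{y} \in \mathcal{V}$,

then $f$ is the orthogonal projection on the subspace $\operatorname{ran} f$.

Proof. Assume that $f : \mathcal{V} \to \mathcal{V}$ is a linear transformation satisfying (a) and (b).
If $\mathbf{x}, \mathbf{y} \in \mathcal{V}$, then

\[
\begin{aligned}
\langle \mathbf{x} - f(\mathbf{x}), f(\mathbf{y})\rangle &= \langle \mathbf{x}, f(\mathbf{y})\rangle - \langle f(\mathbf{x}), f(\mathbf{y})\rangle \\
&= \langle \mathbf{x}, f(\mathbf{y})\rangle - \langle \mathbf{x}, f(f(\mathbf{y}))\rangle \\
&= \langle \mathbf{x}, f(\mathbf{y})\rangle - \langle \mathbf{x}, f(\mathbf{y})\rangle = 0.
\end{aligned}
\]

This means that

\[ f(\mathbf{x}) = \operatorname{proj}_{\operatorname{ran} f}(\mathbf{x}) \]

for every $\mathbf{x} \in \mathcal{V}$. $\square$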
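In coordinates, the two conditions of Theorem 3.2.31 say that the matrix $P$ of the operator satisfies $P^2 = P$ and $P^* = P$, where $P^*$ is the conjugate transpose. The following sketch checks both for the projection onto the span of a hypothetical orthonormal set in $\mathbb{C}^3$; all names and the sample vectors are illustrative assumptions, and the matrix is built from the formula of Corollary 3.2.19.

```python
import math

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_transpose(A):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 1.0 / math.sqrt(2.0)
# Hypothetical orthonormal set in C^3.
basis = [[s + 0j, s * 1j, 0j], [0j, 0j, 1 + 0j]]

# P[j][k] = sum over u of u_j * conj(u_k), so that P x = sum over u of <x,u> u.
P = [[sum(u[j] * u[k].conjugate() for u in basis) for k in range(3)]
     for j in range(3)]

idempotent = close(matmul(P, P), P)          # condition (a): f(f(x)) = f(x)
self_adjoint = close(conj_transpose(P), P)   # condition (b): <f(x),y> = <x,f(y)>
print(idempotent, self_adjoint)
```

Both flags come out true for this example, as properties (h) and (i) of Theorem 3.2.30 require.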


3.2.5 The Gram-Schmidt orthogonalization process and
orthonormal bases


In Corollary 3.2.19 we noted that calculating the projection on the subspace
$\mathcal{U} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is especially simple if $\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ is an orthonormal set.
We are going to show that every finite dimensional subspace can be spanned by
orthonormal vectors. This is accomplished by modifying an arbitrary spanning
set by what is called the Gram-Schmidt process. We motivate the idea of the
Gram-Schmidt process by considering a couple of examples.


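Example 3.2.32. Find a vector $\mathbf{v} \in \mathbb{C}^3$ such that

\[ \operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \mathbf{v} \right\} = \operatorname{Span}\left\{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} i \\ -1 \\ 1 \end{pmatrix} \right\} \quad\text{and}\quad \left\langle \mathbf{v}, \begin{pmatrix} i \\ -1 \\ 1 \end{pmatrix} \right\rangle = 0. \]

Solution. Let $\mathcal{U} = \operatorname{Span}\left\{ \begin{pmatrix} i \\ -1 \\ 1 \end{pmatrix} \right\}$. By Example 3.2.15, we have

\[ \operatorname{proj}_{\mathcal{U}}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{3} \\ \frac{1}{3}i \\ -\frac{1}{3}i \end{pmatrix}. \]

Consequently, we can take

\[ \mathbf{v} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} - \operatorname{proj}_{\mathcal{U}}\left( \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right) = \begin{pmatrix} \frac{2}{3} \\ 1 - \frac{1}{3}i \\ 1 + \frac{1}{3}i \end{pmatrix}. \]
$\square$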


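Example 3.2.33. Let $\mathbf{u}_1, \dots, \mathbf{u}_m$ be nonzero orthogonal vectors in an inner
product space $\mathcal{V}$ and let $\mathbf{v}$ be a vector in $\mathcal{V}$ such that $\mathbf{v} \notin \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_m\}$.
Find a nonzero vector $\mathbf{u}_{m+1}$ such that

\[ \mathbf{u}_{m+1} \in (\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_m\})^{\perp} \]

and

\[ \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_m, \mathbf{u}_{m+1}\} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_m, \mathbf{v}\}. \]

Solution. By Theorem 3.2.18, we have

\[ \operatorname{proj}_{\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_m\}}(\mathbf{v}) = \frac{\langle \mathbf{v}, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}\mathbf{u}_1 + \frac{\langle \mathbf{v}, \mathbf{u}_2\rangle}{\langle \mathbf{u}_2, \mathbf{u}_2\rangle}\mathbf{u}_2 + \cdots + \frac{\langle \mathbf{v}, \mathbf{u}_m\rangle}{\langle \mathbf{u}_m, \mathbf{u}_m\rangle}\mathbf{u}_m. \]

We take

\[ \mathbf{u}_{m+1} = \mathbf{v} - \frac{\langle \mathbf{v}, \mathbf{u}_1\rangle}{\langle \mathbf{u}_1, \mathbf{u}_1\rangle}\mathbf{u}_1 - \frac{\langle \mathbf{v}, \mathbf{u}_2\rangle}{\langle \mathbf{u}_2, \mathbf{u}_2\rangle}\mathbf{u}_2 - \cdots - \frac{\langle \mathbf{v}, \mathbf{u}_m\rangle}{\langle \mathbf{u}_m, \mathbf{u}_m\rangle}\mathbf{u}_m. \]

Clearly, $\mathbf{u}_{m+1} \neq \mathbf{0}$ and

\[ \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_m, \mathbf{v}\} = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_m, \mathbf{u}_{m+1}\}. \]

Moreover, since

\[ \langle \mathbf{u}_{m+1}, \mathbf{u}_j\rangle = \langle \mathbf{v} - \operatorname{proj}_{\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_m\}}(\mathbf{v}), \mathbf{u}_j\rangle = 0 \]

for $j = 1, \dots, m$, we have $\mathbf{u}_{m+1} \in (\operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_m\})^{\perp}$. $\square$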


The method used in the above example leads to the following general result.




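Theorem 3.2.34. For any linearly independent vectors $\mathbf{u}_1, \dots, \mathbf{u}_m$ in
an inner product space $\mathcal{V}$ there are orthogonal vectors $\mathbf{v}_1, \dots, \mathbf{v}_m$ in $\mathcal{V}$
such that

\[ \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\} \]

for every $k \in \{1, \dots, m\}$.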


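Proof. Let $\mathcal{U}_k = \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\}$ for $k \in \{1, \dots, m\}$. We define $\mathbf{v}_1 = \mathbf{u}_1$ and
then successively

\[ \mathbf{v}_k = \mathbf{u}_k - \operatorname{proj}_{\mathcal{U}_{k-1}}(\mathbf{u}_k) \]

for $k \in \{2, \dots, m\}$. Since $\mathbf{u}_k \notin \mathcal{U}_{k-1}$, we have $\mathbf{v}_k \neq \mathbf{0}$.
If $\mathcal{U}_{k-1} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_{k-1}\}$ for some $k \in \{2, \dots, m\}$, then

\[ \mathbf{u}_k = \mathbf{v}_k + \operatorname{proj}_{\mathcal{U}_{k-1}}(\mathbf{u}_k) = \mathbf{v}_k + \operatorname{proj}_{\operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_{k-1}\}}(\mathbf{u}_k) \in \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\} \]

and consequently

\[ \mathcal{U}_k = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\}, \]

because $\mathbf{v}_k \in \mathcal{U}_k$ for every $k \in \{1, \dots, m\}$. This shows by induction that

\[ \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\} = \operatorname{Span}\{\mathbf{v}_1, \dots, \mathbf{v}_k\} \]

for every $k \in \{1, \dots, m\}$.
To finish the proof we note that, by part (a) of Theorem 3.2.30, $\langle \mathbf{v}_k, \mathbf{u}\rangle = 0$
for every $\mathbf{u} \in \mathcal{U}_{k-1}$ and every $k \in \{2, \dots, m\}$. Hence

\[ \langle \mathbf{v}_k, \mathbf{v}_1\rangle = \cdots = \langle \mathbf{v}_k, \mathbf{v}_{k-1}\rangle = 0, \]

because $\mathbf{v}_1, \dots, \mathbf{v}_{k-1} \in \mathcal{U}_{k-1}$. $\square$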


Note that the above proof describes an effective process of constructing an 
orthogonal basis of a subspace from an arbitrary basis. This process is called 
the Gram-Schmidt orthogonalization process. 
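The construction in the proof of Theorem 3.2.34 can be sketched in a few lines. The code below is an illustration under assumptions, not a numerically robust implementation: it works in $\mathbb{C}^n$ with $\langle \mathbf{x}, \mathbf{y}\rangle = \sum_j x_j\overline{y_j}$, it assumes the input vectors are linearly independent, and it computes each projection with the formula of Theorem 3.2.18. The names `inner` and `gram_schmidt` are illustrative; the sample vectors are the ones from Example 3.2.32.

```python
def inner(x, y):
    """Standard inner product on C^n: sum of x_j * conj(y_j) (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return orthogonal v_1, ..., v_m with the same successive spans as the input.

    v_k = u_k - proj_{Span{v_1, ..., v_{k-1}}}(u_k), assuming linear independence.
    """
    ortho = []
    for u in vectors:
        v = list(u)
        for w in ortho:
            c = inner(u, w) / inner(w, w)
            v = [vk - c * wk for vk, wk in zip(v, w)]
        ortho.append(v)
    return ortho

u1 = [1j, -1 + 0j, 1 + 0j]
u2 = [1 + 0j, 1 + 0j, 1 + 0j]
v1, v2 = gram_schmidt([u1, u2])
print(v2)
```

For this input the second vector is $(\frac{2}{3}, 1 - \frac{1}{3}i, 1 + \frac{1}{3}i)$, the vector found in Example 3.2.32. In floating point arithmetic, variants of the process that subtract the projections one at a time from the running vector are usually preferred, but that refinement is outside the scope of this sketch.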


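Example 3.2.35. Let $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4$ be linearly independent vectors in an
inner product space $\mathcal{V}$. Find an orthogonal set $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ such that

\[ \mathbf{u}_1 = \mathbf{v}_1, \]
\[ \operatorname{Span}\{\mathbf{u}_1, \mathbf{u}_2\} = \operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2\}, \]
\[ \operatorname{Span}\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\} = \operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}, \]

and

\[ \operatorname{Span}\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4\} = \operatorname{Span}\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}. \]

Solution. We take

\[ \mathbf{v}_1 = \mathbf{u}_1, \]
\[ \mathbf{v}_2 = \mathbf{u}_2 - \frac{\langle \mathbf{u}_2, \mathbf{v}_1\rangle}{\langle \mathbf{v}_1, \mathbf{v}_1\rangle}\mathbf{v}_1, \]
\[ \mathbf{v}_3 = \mathbf{u}_3 - \frac{\langle \mathbf{u}_3, \mathbf{v}_1\rangle}{\langle \mathbf{v}_1, \mathbf{v}_1\rangle}\mathbf{v}_1 - \frac{\langle \mathbf{u}_3, \mathbf{v}_2\rangle}{\langle \mathbf{v}_2, \mathbf{v}_2\rangle}\mathbf{v}_2, \]

and

\[ \mathbf{v}_4 = \mathbf{u}_4 - \frac{\langle \mathbf{u}_4, \mathbf{v}_1\rangle}{\langle \mathbf{v}_1, \mathbf{v}_1\rangle}\mathbf{v}_1 - \frac{\langle \mathbf{u}_4, \mathbf{v}_2\rangle}{\langle \mathbf{v}_2, \mathbf{v}_2\rangle}\mathbf{v}_2 - \frac{\langle \mathbf{u}_4, \mathbf{v}_3\rangle}{\langle \mathbf{v}_3, \mathbf{v}_3\rangle}\mathbf{v}_3. \]
$\square$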
In the first section of this chapter we proved that $\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2$
for any orthogonal vectors $\mathbf{u}$ and $\mathbf{v}$ (the Pythagorean Theorem 3.1.35). This
property easily generalizes to any finite set of orthogonal vectors.


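Theorem 3.2.36 (The General Pythagorean Theorem). For any orthogonal
vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ in an inner product space $\mathcal{V}$ we have

\[ \|\mathbf{v}_1 + \cdots + \mathbf{v}_n\|^2 = \|\mathbf{v}_1\|^2 + \cdots + \|\mathbf{v}_n\|^2. \]

Proof. For any orthogonal vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ we have

\[ \Bigl\| \sum_{j=1}^{n} \mathbf{v}_j \Bigr\|^2 = \Bigl\langle \sum_{j=1}^{n} \mathbf{v}_j, \sum_{k=1}^{n} \mathbf{v}_k \Bigr\rangle = \sum_{j,k=1}^{n} \langle \mathbf{v}_j, \mathbf{v}_k\rangle = \sum_{j=1}^{n} \langle \mathbf{v}_j, \mathbf{v}_j\rangle = \sum_{j=1}^{n} \|\mathbf{v}_j\|^2. \]
$\square$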
If $\mathbf{u}_1, \dots, \mathbf{u}_m$ are linearly independent vectors, then the vectors $\mathbf{v}_1, \dots, \mathbf{v}_m$
obtained by the Gram-Schmidt orthogonalization process are also linearly independent
and thus they are nonzero vectors. By normalizing vectors $\mathbf{v}_1, \dots, \mathbf{v}_m$
we obtain an orthonormal set

\[ \left\{ \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|}, \dots, \frac{\mathbf{v}_m}{\|\mathbf{v}_m\|} \right\}. \]

The process of obtaining an orthonormal set from an arbitrary linearly independent
set is called the Gram-Schmidt orthonormalization process.




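Corollary 3.2.37. For any linearly independent vectors $\mathbf{u}_1, \dots, \mathbf{u}_m$ in
an inner product space $\mathcal{V}$ there are orthonormal vectors $\mathbf{w}_1, \dots, \mathbf{w}_m$ in
$\mathcal{V}$ such that

\[ \operatorname{Span}\{\mathbf{u}_1, \dots, \mathbf{u}_k\} = \operatorname{Span}\{\mathbf{w}_1, \dots, \mathbf{w}_k\} \]

for every $k \in \{1, \dots, m\}$.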


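Example 3.2.38. We apply the Gram-Schmidt orthonormalization process
to the set $\{1, t, t^2\}$ in the vector space of polynomials on the interval $[0, 1]$ with
the inner product

\[ \langle f, g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt. \]

First we define $f_0(t) = 1$. Since

\[ \|f_0\|^2 = \int_0^1 1\,dt = 1, \]

we let $g_0(t) = f_0(t) = 1$.
Next we find $f_1$:

\[ f_1(t) = t - \langle t, 1\rangle 1 = t - \int_0^1 t\,dt = t - \frac{1}{2}. \]

Since

\[ \|f_1\|^2 = \int_0^1 \left( t - \frac{1}{2} \right)^2 dt = \frac{1}{12}, \]

we define

\[ g_1(t) = \frac{f_1(t)}{\|f_1\|} = 2\sqrt{3}\left( t - \frac{1}{2} \right). \]

Now we find $f_2$:

\[ f_2(t) = t^2 - \langle t^2, 1\rangle 1 - \left\langle t^2, 2\sqrt{3}\left( t - \tfrac{1}{2} \right) \right\rangle 2\sqrt{3}\left( t - \tfrac{1}{2} \right) = t^2 - t + \frac{1}{6}. \]

Since

\[ \|f_2\|^2 = \int_0^1 \left( t^2 - t + \frac{1}{6} \right)^2 dt = \frac{1}{180}, \]

we define

\[ g_2(t) = \frac{f_2(t)}{\|f_2\|} = 6\sqrt{5}\left( t^2 - t + \frac{1}{6} \right). \]

By applying the Gram-Schmidt orthonormalization process to the set
$\{1, t, t^2\}$ we obtain the following orthonormal set

\[ \left\{ 1,\ 2\sqrt{3}\left( t - \frac{1}{2} \right),\ 6\sqrt{5}\left( t^2 - t + \frac{1}{6} \right) \right\}. \]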
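The intermediate polynomials and squared norms in Example 3.2.38 can be reproduced with exact rational arithmetic. The sketch below is illustrative only: a polynomial is stored as a list of coefficients with the constant term first, the function names are assumptions, and the inner product $\int_0^1 f(t)g(t)\,dt$ is evaluated termwise using $\int_0^1 t^{i+j}\,dt = \frac{1}{i+j+1}$ (real coefficients, so no conjugation).

```python
from fractions import Fraction

def inner(p, q):
    """Exact value of the integral over [0, 1] of p(t) q(t) for coefficient lists."""
    return sum(Fraction(a) * Fraction(b) / (i + j + 1)
               for i, a in enumerate(p) for j, b in enumerate(q))

def gram_schmidt(polys):
    """Orthogonalize coefficient lists (all of the same length) without normalizing."""
    ortho = []
    for p in polys:
        f = [Fraction(c) for c in p]
        for g in ortho:
            c = inner(p, g) / inner(g, g)
            f = [fk - c * gk for fk, gk in zip(f, g)]
        ortho.append(f)
    return ortho

monomials = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # 1, t, t^2
f0, f1, f2 = gram_schmidt(monomials)
norms_sq = [inner(f, f) for f in (f0, f1, f2)]
print(f1, f2, norms_sq)
```

The output gives $f_1 = t - \frac{1}{2}$, $f_2 = t^2 - t + \frac{1}{6}$ and squared norms $1$, $\frac{1}{12}$, $\frac{1}{180}$; dividing each polynomial by its norm yields the factors $2\sqrt{3}$ and $6\sqrt{5}$ of the orthonormal set.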


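Example 3.2.39. Use the result from Example 3.2.38 to find the best approximation
to the function $\cos \pi t$ by quadratic polynomials on the interval
$[0, 1]$ with respect to the inner product $\langle f, g\rangle = \int_0^1 f(t)\overline{g(t)}\,dt$.

Solution. Since

\[ \langle \cos \pi t, 1\rangle = 0, \qquad \left\langle \cos \pi t, 2\sqrt{3}\left( t - \tfrac{1}{2} \right) \right\rangle = -\frac{4\sqrt{3}}{\pi^2}, \]

and

\[ \left\langle \cos \pi t, 6\sqrt{5}\left( t^2 - t + \tfrac{1}{6} \right) \right\rangle = 0, \]

we have

\[ \operatorname{proj}_{\operatorname{Span}\{1, t, t^2\}}(\cos \pi t) = -\frac{4\sqrt{3}}{\pi^2}\,2\sqrt{3}\left( t - \frac{1}{2} \right) = -\frac{24}{\pi^2}t + \frac{12}{\pi^2}. \]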
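The coefficients in Example 3.2.39 can be checked numerically. This is a rough sketch under assumptions: the integrals are approximated by the midpoint rule with `n = 20000` subintervals, the orthonormal polynomials are taken from Example 3.2.38, and all names are illustrative.

```python
import math

def inner(f, g, n=20000):
    """Approximate the integral over [0, 1] of f(t) g(t) dt (midpoint rule, real functions)."""
    h = 1.0 / n
    return sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(n)) * h

# Orthonormal polynomials from Example 3.2.38.
def g0(t):
    return 1.0

def g1(t):
    return 2.0 * math.sqrt(3.0) * (t - 0.5)

def g2(t):
    return 6.0 * math.sqrt(5.0) * (t * t - t + 1.0 / 6.0)

def target(t):
    return math.cos(math.pi * t)

# Coefficients <cos(pi t), g_k> of the projection (Corollary 3.2.19).
coeffs = [inner(target, g) for g in (g0, g1, g2)]

def best_quadratic(t):
    """Projection of cos(pi t) on Span{1, t, t^2}, evaluated at t."""
    return sum(c * g(t) for c, g in zip(coeffs, (g0, g1, g2)))

print(coeffs, best_quadratic(0.0))
```

The middle coefficient is approximately $-4\sqrt{3}/\pi^2$ and the other two vanish up to quadrature error, so the best approximation is in fact a polynomial of degree one.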


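Example 3.2.40. Consider the vector space $\mathcal{P}_m(\mathbb{R})$ with the inner product

\[ \langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt. \]

We apply the Gram-Schmidt orthogonalization process to the polynomials
$1, t, \dots, t^m$ and get polynomials $1, p_1, \dots, p_m$. Show that

\[ p_m \in \operatorname{Span}\left\{ \bigl( (1 - t^2)^m \bigr)^{(m)} \right\}. \]

Solution. First we find that

\[
\begin{aligned}
\int_{-1}^{1} \bigl( (1 - t^2)^m \bigr)^{(m)} t^{m-1}\,dt &= \Bigl[ \bigl( (1 - t^2)^m \bigr)^{(m-1)} t^{m-1} \Bigr]_{-1}^{1} - (m - 1)\int_{-1}^{1} \bigl( (1 - t^2)^m \bigr)^{(m-1)} t^{m-2}\,dt \\
&= -(m - 1)\int_{-1}^{1} \bigl( (1 - t^2)^m \bigr)^{(m-1)} t^{m-2}\,dt,
\end{aligned}
\]

because the derivatives of $(1 - t^2)^m$ of order less than $m$ vanish at $t = 1$ and $t = -1$.
If we continue to integrate by parts, we end up with

\[ \int_{-1}^{1} \bigl( (1 - t^2)^m \bigr)^{(m)} t^{m-1}\,dt = (-1)^{m-1}(m - 1)!\int_{-1}^{1} \bigl( (1 - t^2)^m \bigr)'\,dt = 0. \]

In a similar way we get

\[ \int_{-1}^{1} \bigl( (1 - t^2)^m \bigr)^{(m)} t^{j}\,dt = 0 \]

for every $j \in \{0, \dots, m - 1\}$. Now, since

\[ \mathcal{P}_{m-1}(\mathbb{R}) \oplus \mathcal{P}_{m-1}^{\perp}(\mathbb{R}) = \mathcal{P}_m(\mathbb{R}), \]

$\dim \mathcal{P}_m(\mathbb{R}) = m + 1$, and $\dim \mathcal{P}_{m-1}(\mathbb{R}) = m$, we have $\dim \mathcal{P}_{m-1}^{\perp}(\mathbb{R}) = 1$. This
gives us our result because $\bigl( (1 - t^2)^m \bigr)^{(m)} \in \mathcal{P}_{m-1}^{\perp}(\mathbb{R})$. $\square$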


From Corollary 3.2.37 it follows that every finite dimensional subspace of an 
inner product space has an orthonormal spanning set. In other words, every 
finite dimensional subspace of an inner product space has an orthonormal basis. 


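Theorem 3.2.41. Let $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ be an orthonormal set in an inner
product space $\mathcal{V}$. The following conditions are equivalent:

(a) $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ is a basis in $\mathcal{V}$;

(b) $(\operatorname{Span}\{\mathbf{x}_1, \dots, \mathbf{x}_n\})^{\perp} = \{\mathbf{0}\}$;

(c) $\mathbf{v} = \langle \mathbf{v}, \mathbf{x}_1\rangle\mathbf{x}_1 + \cdots + \langle \mathbf{v}, \mathbf{x}_n\rangle\mathbf{x}_n$ for every $\mathbf{v} \in \mathcal{V}$;

(d) $\langle \mathbf{v}, \mathbf{w}\rangle = \sum_{j=1}^{n} \langle \mathbf{v}, \mathbf{x}_j\rangle\overline{\langle \mathbf{w}, \mathbf{x}_j\rangle}$ for every $\mathbf{v}, \mathbf{w} \in \mathcal{V}$;

(e) $\|\mathbf{v}\|^2 = |\langle \mathbf{v}, \mathbf{x}_1\rangle|^2 + \cdots + |\langle \mathbf{v}, \mathbf{x}_n\rangle|^2$ for every $\mathbf{v} \in \mathcal{V}$.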


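Proof. Assume $\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ is a basis in $\mathcal{V}$. If $\mathbf{v} \in (\operatorname{Span}\{\mathbf{x}_1, \dots, \mathbf{x}_n\})^{\perp}$, then

\[ \mathbf{v} = \alpha_1\mathbf{x}_1 + \cdots + \alpha_n\mathbf{x}_n \]

for some $\alpha_1, \dots, \alpha_n \in \mathbb{K}$ and $\langle \mathbf{v}, \mathbf{x}_j\rangle = 0$ for every $j = 1, \dots, n$. Since the set
$\{\mathbf{x}_1, \dots, \mathbf{x}_n\}$ is orthonormal, we have

\[ 0 = \langle \mathbf{v}, \mathbf{x}_j\rangle = \langle \alpha_1\mathbf{x}_1 + \cdots + \alpha_n\mathbf{x}_n, \mathbf{x}_j\rangle = \langle \alpha_1\mathbf{x}_1, \mathbf{x}_j\rangle + \cdots + \langle \alpha_n\mathbf{x}_n, \mathbf{x}_j\rangle = \alpha_j\langle \mathbf{x}_j, \mathbf{x}_j\rangle = \alpha_j, \]

for every $j = 1, \dots, n$, which means that $\mathbf{v} = \mathbf{0}$. This shows that (a) implies
(b).

Now we observe that

\[ \langle \mathbf{v} - (\langle \mathbf{v}, \mathbf{x}_1\rangle\mathbf{x}_1 + \cdots + \langle \mathbf{v}, \mathbf{x}_n\rangle\mathbf{x}_n), \mathbf{x}_j\rangle = 0 \]

for every $\mathbf{v} \in \mathcal{V}$ and every $j = 1, \dots, n$, and thus

\[ \mathbf{v} - (\langle \mathbf{v}, \mathbf{x}_1\rangle\mathbf{x}_1 + \cdots + \langle \mathbf{v}, \mathbf{x}_n\rangle\mathbf{x}_n) \in (\operatorname{Span}\{\mathbf{x}_1, \dots, \mathbf{x}_n\})^{\perp}. \]

Consequently, if $(\operatorname{Span}\{\mathbf{x}_1, \dots, \mathbf{x}_n\})^{\perp} = \{\mathbf{0}\}$, then

\[ \mathbf{v} = \langle \mathbf{v}, \mathbf{x}_1\rangle\mathbf{x}_1 + \cdots + \langle \mathbf{v}, \mathbf{x}_n\rangle\mathbf{x}_n, \]

for every $\mathbf{v} \in \mathcal{V}$. This shows that (b) implies (c).
Since (c) clearly implies (a), the conditions (a), (b), and (c) are equivalent.
Now assume that (c) holds and consider arbitrary $\mathbf{v}, \mathbf{w} \in \mathcal{V}$. Then

\[ \langle \mathbf{v}, \mathbf{w}\rangle = \Bigl\langle \sum_{j=1}^{n} \langle \mathbf{v}, \mathbf{x}_j\rangle\mathbf{x}_j, \sum_{k=1}^{n} \langle \mathbf{w}, \mathbf{x}_k\rangle\mathbf{x}_k \Bigr\rangle = \sum_{j,k=1}^{n} \langle \mathbf{v}, \mathbf{x}_j\rangle\overline{\langle \mathbf{w}, \mathbf{x}_k\rangle}\langle \mathbf{x}_j, \mathbf{x}_k\rangle = \sum_{j=1}^{n} \langle \mathbf{v}, \mathbf{x}_j\rangle\overline{\langle \mathbf{w}, \mathbf{x}_j\rangle}. \]

Thus (c) implies (d).

To see that (d) implies (e) it suffices to let $\mathbf{w} = \mathbf{v}$ in (d).

To complete the proof we show that (e) implies (b). Indeed, if (e) holds and
$\mathbf{v} \in (\operatorname{Span}\{\mathbf{x}_1, \dots, \mathbf{x}_n\})^{\perp}$, then

\[ \|\mathbf{v}\|^2 = |\langle \mathbf{v}, \mathbf{x}_1\rangle|^2 + \cdots + |\langle \mathbf{v}, \mathbf{x}_n\rangle|^2 = 0, \]

and thus $\mathbf{v} = \mathbf{0}$. $\square$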


3.3 The adjoint of a linear transformation 


For any $\mathbf{v}_0$ in an inner product space $\mathcal{V}$ the function $f(\mathbf{x}) = \langle \mathbf{x}, \mathbf{v}_0\rangle$ is a linear
transformation from $\mathcal{V}$ to $\mathbb{K}$, which is an immediate consequence of the definition
of the inner product. It turns out that, if $\mathcal{V}$ is a finite dimensional inner product
space, then every linear transformation from $\mathcal{V}$ to $\mathbb{K}$ is of this form.


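Theorem 3.3.1 (Representation Theorem). Let $\mathcal{V}$ be a finite dimensional
inner product space and let $f : \mathcal{V} \to \mathbb{K}$ be a linear transformation.
Then there exists a unique $\mathbf{v}_f \in \mathcal{V}$ such that

\[ f(\mathbf{x}) = \langle \mathbf{x}, \mathbf{v}_f\rangle \]

for every $\mathbf{x} \in \mathcal{V}$.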


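Proof. Let $f : \mathcal{V} \to \mathbb{K}$ be a linear transformation. If $f$ is the zero transformation,
then clearly $\mathbf{v}_f = \mathbf{0}$ has the desired property.

Assume $f : \mathcal{V} \to \mathbb{K}$ is a nonzero linear transformation. Then $\ker f \neq \mathcal{V}$. Let
$\mathbf{u}$ be a unit vector in $(\ker f)^{\perp}$. Since $f(\mathbf{u})\mathbf{x} - f(\mathbf{x})\mathbf{u} \in \ker f$ for every $\mathbf{x} \in \mathcal{V}$, we
have

\[ \langle f(\mathbf{u})\mathbf{x} - f(\mathbf{x})\mathbf{u}, \mathbf{u}\rangle = 0, \]

which gives us

\[ \langle f(\mathbf{u})\mathbf{x}, \mathbf{u}\rangle = \langle f(\mathbf{x})\mathbf{u}, \mathbf{u}\rangle = f(\mathbf{x})\|\mathbf{u}\|^2 = f(\mathbf{x}). \]

Consequently

\[ f(\mathbf{x}) = \langle f(\mathbf{u})\mathbf{x}, \mathbf{u}\rangle = \langle \mathbf{x}, \overline{f(\mathbf{u})}\,\mathbf{u}\rangle. \]

If we take $\mathbf{v}_f = \overline{f(\mathbf{u})}\,\mathbf{u}$, then

\[ f(\mathbf{x}) = \langle \mathbf{x}, \mathbf{v}_f\rangle \]

for every $\mathbf{x} \in \mathcal{V}$.
Now suppose $\mathbf{w}$ is another vector such that $f(\mathbf{x}) = \langle \mathbf{x}, \mathbf{w}\rangle$ for every $\mathbf{x} \in \mathcal{V}$.
But then

\[ \|\mathbf{v}_f - \mathbf{w}\|^2 = \langle \mathbf{v}_f - \mathbf{w}, \mathbf{v}_f - \mathbf{w}\rangle = \langle \mathbf{v}_f - \mathbf{w}, \mathbf{v}_f\rangle - \langle \mathbf{v}_f - \mathbf{w}, \mathbf{w}\rangle = f(\mathbf{v}_f - \mathbf{w}) - f(\mathbf{v}_f - \mathbf{w}) = 0 \]

and thus $\mathbf{w} = \mathbf{v}_f$. $\square$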
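In $\mathbb{C}^n$ with the standard inner product the representing vector of Theorem 3.3.1 can be written down explicitly. The sketch below is an illustration under assumptions: the functional `f` is a hypothetical example, and $\mathbf{v}_f$ is computed from the standard orthonormal basis as $\sum_j \overline{f(\mathbf{e}_j)}\,\mathbf{e}_j$ (which follows from part (c) of Theorem 3.2.41) rather than from the unit vector used in the proof. The last lines compare the two descriptions.

```python
import math

def inner(x, y):
    """Standard inner product on C^n: sum of x_j * conj(y_j) (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def f(x):
    """A hypothetical linear functional on C^3."""
    return (2 - 1j) * x[0] + 3j * x[1] - x[2]

n = 3
standard_basis = [[1 + 0j if i == j else 0j for i in range(n)] for j in range(n)]

# Representing vector: v_f = sum over j of conj(f(e_j)) e_j.
v_f = [f(e).conjugate() for e in standard_basis]

x = [1 + 1j, -2 + 0j, 0.5j]
lhs, rhs = f(x), inner(x, v_f)

# Link with the proof: u = v_f / ||v_f|| is a unit vector orthogonal to ker f,
# and conj(f(u)) u recovers v_f.
norm = math.sqrt(inner(v_f, v_f).real)
u = [c / norm for c in v_f]
recovered = [f(u).conjugate() * c for c in u]
print(lhs, rhs)
```

The values `lhs` and `rhs` agree, and `recovered` coincides with `v_f`, as the uniqueness part of the theorem predicts.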


From Theorem 3.3.1 we obtain the following important result. 


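Theorem 3.3.2. Let $\mathcal{V}$ and $\mathcal{W}$ be finite dimensional inner product
spaces. For every linear transformation $f : \mathcal{V} \to \mathcal{W}$ there is a unique
linear transformation $g : \mathcal{W} \to \mathcal{V}$ such that

\[ \langle f(\mathbf{v}), \mathbf{w}\rangle = \langle \mathbf{v}, g(\mathbf{w})\rangle \]

for every $\mathbf{v} \in \mathcal{V}$ and $\mathbf{w} \in \mathcal{W}$.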




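Proof. Let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. For every $\mathbf{w} \in \mathcal{W}$ the function
$f_{\mathbf{w}} : \mathcal{V} \to \mathbb{K}$ defined by $f_{\mathbf{w}}(\mathbf{v}) = \langle f(\mathbf{v}), \mathbf{w}\rangle$ is linear and thus, by Theorem 3.3.1,
there is a unique vector $\mathbf{z}_{\mathbf{w}}$ such that $f_{\mathbf{w}}(\mathbf{v}) = \langle \mathbf{v}, \mathbf{z}_{\mathbf{w}}\rangle$ for every $\mathbf{v} \in \mathcal{V}$. Clearly,
the function $f_{\mathbf{w}}$ depends on $\mathbf{w}$ and thus $\mathbf{z}_{\mathbf{w}}$ depends on $\mathbf{w}$. In other words, there
is a function $g : \mathcal{W} \to \mathcal{V}$ such that

\[ \langle f(\mathbf{v}), \mathbf{w}\rangle = \langle \mathbf{v}, \mathbf{z}_{\mathbf{w}}\rangle = \langle \mathbf{v}, g(\mathbf{w})\rangle \]

for every $\mathbf{v} \in \mathcal{V}$ and $\mathbf{w} \in \mathcal{W}$. We need to show that $g$ is linear.
If $\mathbf{w}_1, \mathbf{w}_2 \in \mathcal{W}$, then

\[ \langle f(\mathbf{v}), \mathbf{w}_1 + \mathbf{w}_2\rangle = \langle f(\mathbf{v}), \mathbf{w}_1\rangle + \langle f(\mathbf{v}), \mathbf{w}_2\rangle = \langle \mathbf{v}, g(\mathbf{w}_1)\rangle + \langle \mathbf{v}, g(\mathbf{w}_2)\rangle = \langle \mathbf{v}, g(\mathbf{w}_1) + g(\mathbf{w}_2)\rangle \]

for every $\mathbf{v} \in \mathcal{V}$. By the uniqueness part of Theorem 3.3.1 we have

\[ g(\mathbf{w}_1 + \mathbf{w}_2) = g(\mathbf{w}_1) + g(\mathbf{w}_2). \]

Similarly, if $\alpha \in \mathbb{K}$ and $\mathbf{w} \in \mathcal{W}$, then

\[ \langle f(\mathbf{v}), \alpha\mathbf{w}\rangle = \overline{\alpha}\langle f(\mathbf{v}), \mathbf{w}\rangle = \overline{\alpha}\langle \mathbf{v}, g(\mathbf{w})\rangle = \langle \mathbf{v}, \alpha g(\mathbf{w})\rangle \]

for every $\mathbf{v} \in \mathcal{V}$, which gives us

\[ g(\alpha\mathbf{w}) = \alpha g(\mathbf{w}). \]
$\square$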


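Definition 3.3.3. Let $\mathcal{V}$ and $\mathcal{W}$ be finite dimensional inner product
spaces and let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. The unique linear
transformation $g : \mathcal{W} \to \mathcal{V}$ such that

\[ \langle f(\mathbf{v}), \mathbf{w}\rangle = \langle \mathbf{v}, g(\mathbf{w})\rangle \]

for every $\mathbf{v} \in \mathcal{V}$ and $\mathbf{w} \in \mathcal{W}$ is called the adjoint of $f$ and is denoted by
$f^*$.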


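Note that, if $g = f^*$, then also $f = g^*$. Indeed, if $\langle f(\mathbf{v}), \mathbf{w}\rangle = \langle \mathbf{v}, g(\mathbf{w})\rangle$ for
every $\mathbf{v} \in \mathcal{V}$ and $\mathbf{w} \in \mathcal{W}$, then

\[ \langle g(\mathbf{w}), \mathbf{v}\rangle = \overline{\langle \mathbf{v}, g(\mathbf{w})\rangle} = \overline{\langle f(\mathbf{v}), \mathbf{w}\rangle} = \langle \mathbf{w}, f(\mathbf{v})\rangle \]

for every $\mathbf{v} \in \mathcal{V}$ and $\mathbf{w} \in \mathcal{W}$.

Theorem 3.3.2 says that for any finite dimensional inner product spaces $\mathcal{V}$
and $\mathcal{W}$, if $f \in \mathcal{L}(\mathcal{V}, \mathcal{W})$, then $f^* \in \mathcal{L}(\mathcal{W}, \mathcal{V})$. We can think of $*$ as an operation
from $\mathcal{L}(\mathcal{V}, \mathcal{W})$ to $\mathcal{L}(\mathcal{W}, \mathcal{V})$, that is, $* : \mathcal{L}(\mathcal{V}, \mathcal{W}) \to \mathcal{L}(\mathcal{W}, \mathcal{V})$. The next theorem
lists some useful algebraic properties of the operation $*$.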




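Theorem 3.3.4. Let $\mathcal{V}$, $\mathcal{W}$, and $\mathcal{X}$ be finite dimensional inner product
spaces.

(a) $\operatorname{Id}^* = \operatorname{Id}$;

(b) $(gf)^* = f^* g^*$ for every $f \in \mathcal{L}(\mathcal{V}, \mathcal{W})$ and $g \in \mathcal{L}(\mathcal{W}, \mathcal{X})$;

(c) $(f_1 + f_2)^* = f_1^* + f_2^*$ for every $f_1, f_2 \in \mathcal{L}(\mathcal{V}, \mathcal{W})$;

(d) $(\alpha f)^* = \overline{\alpha} f^*$ for every $f \in \mathcal{L}(\mathcal{V}, \mathcal{W})$ and $\alpha \in \mathbb{K}$;

(e) $(f^*)^* = f$ for every $f \in \mathcal{L}(\mathcal{V}, \mathcal{W})$.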


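Proof. (a) For any $\mathbf{v}, \mathbf{w} \in \mathcal{V}$ we have

\[ \langle \operatorname{Id}(\mathbf{v}), \mathbf{w}\rangle = \langle \mathbf{v}, \mathbf{w}\rangle = \langle \mathbf{v}, \operatorname{Id}(\mathbf{w})\rangle. \]

(b) For any $\mathbf{v} \in \mathcal{V}$ and $\mathbf{x} \in \mathcal{X}$, we have

\[ \langle g(f(\mathbf{v})), \mathbf{x}\rangle = \langle f(\mathbf{v}), g^*(\mathbf{x})\rangle = \langle \mathbf{v}, f^*(g^*(\mathbf{x}))\rangle. \]

(c) For any $\mathbf{v} \in \mathcal{V}$ and $\mathbf{w} \in \mathcal{W}$, we have

\[ \langle (f_1 + f_2)(\mathbf{v}), \mathbf{w}\rangle = \langle f_1(\mathbf{v}), \mathbf{w}\rangle + \langle f_2(\mathbf{v}), \mathbf{w}\rangle = \langle \mathbf{v}, f_1^*(\mathbf{w})\rangle + \langle \mathbf{v}, f_2^*(\mathbf{w})\rangle = \langle \mathbf{v}, (f_1^* + f_2^*)(\mathbf{w})\rangle. \]

(d) For any $\mathbf{v} \in \mathcal{V}$, $\mathbf{w} \in \mathcal{W}$, and $\alpha \in \mathbb{K}$, we have

\[ \langle (\alpha f)(\mathbf{v}), \mathbf{w}\rangle = \langle \alpha f(\mathbf{v}), \mathbf{w}\rangle = \alpha\langle f(\mathbf{v}), \mathbf{w}\rangle = \alpha\langle \mathbf{v}, f^*(\mathbf{w})\rangle = \langle \mathbf{v}, \overline{\alpha} f^*(\mathbf{w})\rangle. \]

(e) For any $\mathbf{v} \in \mathcal{V}$ and $\mathbf{w} \in \mathcal{W}$, we have

\[ \langle f(\mathbf{v}), \mathbf{w}\rangle = \langle \mathbf{v}, f^*(\mathbf{w})\rangle = \langle (f^*)^*(\mathbf{v}), \mathbf{w}\rangle. \]
$\square$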


Example 3.3.5. Let f : C? — C3 and g: C? > C® be the linear transforma- 
tions defined by 


(eq) Em ls) Eh 


Show that f* = g. 


Solution. We have 


(DDE) >= 




Ty r2 Ly 0 
yl og Y2 ) = Yi} , | v2 ) = yi 02 + 2142. 
ZA 22 21 Y2 


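Example 3.3.6. In this example we give an application of the adjoint of a
linear operator.

Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$
be a linear operator. Use the adjoint of $f$ to show that, if $\{\mathbf{a}_1, \dots, \mathbf{a}_n\}$ and
$\{\mathbf{b}_1, \dots, \mathbf{b}_n\}$ are two orthonormal bases in $\mathcal{V}$, then

\[ \sum_{j=1}^{n} \|f(\mathbf{a}_j)\|^2 = \sum_{j=1}^{n} \|f(\mathbf{b}_j)\|^2. \]

Solution. From Theorem 3.2.41 we get

\[ \|f(\mathbf{a}_j)\|^2 = \sum_{k=1}^{n} |\langle f(\mathbf{a}_j), \mathbf{b}_k\rangle|^2 \]

for every $j = 1, \dots, n$ and

\[ \|f^*(\mathbf{b}_k)\|^2 = \sum_{j=1}^{n} |\langle f^*(\mathbf{b}_k), \mathbf{a}_j\rangle|^2 \]

for every $k = 1, \dots, n$. Hence

\[ \sum_{j=1}^{n} \|f(\mathbf{a}_j)\|^2 = \sum_{j=1}^{n}\sum_{k=1}^{n} |\langle f(\mathbf{a}_j), \mathbf{b}_k\rangle|^2 = \sum_{k=1}^{n}\sum_{j=1}^{n} |\langle \mathbf{a}_j, f^*(\mathbf{b}_k)\rangle|^2 = \sum_{k=1}^{n} \|f^*(\mathbf{b}_k)\|^2. \]

In a similar way we obtain

\[ \sum_{j=1}^{n} \|f(\mathbf{b}_j)\|^2 = \sum_{k=1}^{n} \|f^*(\mathbf{b}_k)\|^2, \]

which gives us the desired result. $\square$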




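Theorem 3.3.7. Let $\mathcal{V}$ and $\mathcal{W}$ be finite dimensional inner product
spaces. A linear transformation $f : \mathcal{V} \to \mathcal{W}$ is invertible if and only
if $f^* : \mathcal{W} \to \mathcal{V}$ is invertible and then we have

\[ (f^*)^{-1} = (f^{-1})^*. \]

Proof. From

\[ f^{-1}f = \operatorname{Id}_{\mathcal{V}} \]

we get

\[ f^*(f^{-1})^* = (f^{-1}f)^* = \operatorname{Id}_{\mathcal{V}}^* = \operatorname{Id}_{\mathcal{V}} \]

and from

\[ ff^{-1} = \operatorname{Id}_{\mathcal{W}} \]

we get

\[ (f^{-1})^* f^* = (ff^{-1})^* = \operatorname{Id}_{\mathcal{W}}^* = \operatorname{Id}_{\mathcal{W}}. \]
$\square$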


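In Theorem 2.3.2 we proved that, if $\mathcal{B} = \{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ and $\mathcal{C} = \{\mathbf{w}_1, \dots, \mathbf{w}_n\}$
are bases of vector spaces $\mathcal{V}$ and $\mathcal{W}$, respectively, then for every linear transformation
$f : \mathcal{V} \to \mathcal{W}$ there is a unique $n \times m$ matrix $A$ such that $f(\mathbf{v}) = A\mathbf{v}$ for
all $\mathbf{v} \in \mathcal{V}$. We say that $A$ is the matrix of $f$ relative to the bases $\mathcal{B}$ and $\mathcal{C}$ and
write $A = f_{\mathcal{B} \to \mathcal{C}}$.

In the following theorem we use $A^*$ to denote the conjugate transpose of
$A$. If $A = [a_{kj}]$ is an $n \times m$ matrix with complex entries, then the conjugate
transpose of $A$ is the $m \times n$ matrix $A^*$ whose $(j, k)$ entry is $\overline{a_{kj}}$.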


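Theorem 3.3.8. Let $\mathcal{V}$ and $\mathcal{W}$ be finite dimensional inner product spaces and
let $f : \mathcal{V} \to \mathcal{W}$ be a linear transformation. If $\mathcal{B} = \{\mathbf{v}_1, \dots, \mathbf{v}_m\}$ is an
orthonormal basis of $\mathcal{V}$ and $\mathcal{C} = \{\mathbf{w}_1, \dots, \mathbf{w}_n\}$ an orthonormal basis of
$\mathcal{W}$, then

\[ (f^*)_{\mathcal{C} \to \mathcal{B}} = (f_{\mathcal{B} \to \mathcal{C}})^*. \]

Proof. For all $j \in \{1, \dots, m\}$ and $k \in \{1, \dots, n\}$ we let $a_{kj} = \langle f(\mathbf{v}_j), \mathbf{w}_k\rangle$. Then

\[ f(\mathbf{v}_j) = \langle f(\mathbf{v}_j), \mathbf{w}_1\rangle\mathbf{w}_1 + \cdots + \langle f(\mathbf{v}_j), \mathbf{w}_n\rangle\mathbf{w}_n = a_{1j}\mathbf{w}_1 + \cdots + a_{nj}\mathbf{w}_n, \]

for every $j \in \{1, \dots, m\}$. This means that the matrix $A = [a_{kj}]$ is the matrix
of $f$ relative to the bases $\mathcal{B}$ and $\mathcal{C}$.
On the other hand, for every $k \in \{1, \dots, n\}$, we have

\[
\begin{aligned}
f^*(\mathbf{w}_k) &= \langle f^*(\mathbf{w}_k), \mathbf{v}_1\rangle\mathbf{v}_1 + \cdots + \langle f^*(\mathbf{w}_k), \mathbf{v}_m\rangle\mathbf{v}_m \\
&= \langle \mathbf{w}_k, f(\mathbf{v}_1)\rangle\mathbf{v}_1 + \cdots + \langle \mathbf{w}_k, f(\mathbf{v}_m)\rangle\mathbf{v}_m \\
&= \overline{\langle f(\mathbf{v}_1), \mathbf{w}_k\rangle}\,\mathbf{v}_1 + \cdots + \overline{\langle f(\mathbf{v}_m), \mathbf{w}_k\rangle}\,\mathbf{v}_m \\
&= \overline{a_{k1}}\,\mathbf{v}_1 + \cdots + \overline{a_{km}}\,\mathbf{v}_m,
\end{aligned}
\]

which means that the matrix of $f^*$ relative to the bases $\mathcal{C}$ and $\mathcal{B}$ is the conjugate
transpose of the matrix of $f$ relative to the bases $\mathcal{B}$ and $\mathcal{C}$. $\square$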
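For the standard orthonormal bases of $\mathbb{C}^m$ and $\mathbb{C}^n$, Theorem 3.3.8 says that the adjoint of $\mathbf{x} \mapsto A\mathbf{x}$ is $\mathbf{y} \mapsto A^*\mathbf{y}$. The following sketch checks the defining identity $\langle A\mathbf{v}, \mathbf{w}\rangle = \langle \mathbf{v}, A^*\mathbf{w}\rangle$ for one hypothetical $3 \times 2$ matrix; the matrix, the vectors, and the helper names are assumptions made for the illustration.

```python
def inner(x, y):
    """Standard inner product on C^n: sum of x_j * conj(y_j) (assumed convention)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def conj_transpose(A):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

# Hypothetical 3x2 matrix: a linear transformation from C^2 to C^3.
A = [[1 + 2j, 0j],
     [3j, 1 - 1j],
     [2 + 0j, 1j]]
A_star = conj_transpose(A)      # 2x3 matrix: a transformation from C^3 to C^2

v = [1 - 1j, 2 + 3j]
w = [0.5j, -1 + 0j, 4 + 1j]

lhs = inner(apply(A, v), w)         # <f(v), w>
rhs = inner(v, apply(A_star, w))    # <v, f*(w)>
print(lhs, rhs)
```

The two printed numbers agree. With a basis that is not orthonormal the matrix of the adjoint is in general not the conjugate transpose, which is why the orthonormality assumption appears in the theorem.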


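The adjoint of a linear transformation $f \in \mathcal{L}(\mathcal{V}, \mathcal{W})$ is a linear transformation
$f^* \in \mathcal{L}(\mathcal{W}, \mathcal{V})$. If $\mathcal{V} = \mathcal{W}$, then $f, f^* \in \mathcal{L}(\mathcal{V}, \mathcal{V}) = \mathcal{L}(\mathcal{V})$ and we can consider
properties of the adjoint operation that simply do not make sense when $\mathcal{V} \neq \mathcal{W}$.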


Definition 3.3.9. Let $\mathcal{V}$ be a finite dimensional inner product space
and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator.

(a) If $ff^* = f^*f$, then $f$ is called a normal operator.

(b) If $f^* = f$, then $f$ is called a self-adjoint operator.


Clearly, every self-adjoint operator is normal. 

Note that self-adjoint operators can be defined for all inner product spaces (not necessarily finite dimensional): a linear operator $f : \mathcal{V} \to \mathcal{V}$ is self-adjoint if $\langle f(\mathbf{x}), \mathbf{y} \rangle = \langle \mathbf{x}, f(\mathbf{y}) \rangle$ for every $\mathbf{x}, \mathbf{y} \in \mathcal{V}$. More on operators on infinite dimensional inner product spaces can be found in Section 5.


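Example 3.3.10. Consider the operator $f \in \mathcal{L}(\mathbb{C}^2, \mathbb{C}^2)$ defined by $f(\mathbf{x}) = A\mathbf{x}$, where
\[ A = \begin{bmatrix} 2i & 1-i \\ -1-i & i \end{bmatrix}. \]
Show that $f$ is normal but not self-adjoint.

Solution. Since
\[ A^* = \begin{bmatrix} -2i & -1+i \\ 1+i & -i \end{bmatrix} \neq A, \]
$f$ is not self-adjoint. On the other hand, since
\[ AA^* = A^*A = \begin{bmatrix} 6 & -3-3i \\ -3+3i & 3 \end{bmatrix}, \]
$f$ is a normal operator. □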
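
The two conditions of Definition 3.3.9 are easy to test by machine. The sketch below is a hedged illustration: it assumes that the matrix of Example 3.3.10 is $A = \begin{bmatrix} 2i & 1-i \\ -1-i & i \end{bmatrix}$ (a reading consistent with the eigenvalues $0$ and $3i$ found in Example 3.4.19), and the helper names `matmul` and `conj_transpose` are ours.

```python
def matmul(A, B):
    # product of two matrices given as lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_transpose(M):
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

# Assumed matrix of Example 3.3.10
A = [[2j, 1 - 1j],
     [-1 - 1j, 1j]]
A_star = conj_transpose(A)

is_self_adjoint = (A == A_star)                           # f* = f ?
is_normal = (matmul(A, A_star) == matmul(A_star, A))      # f f* = f* f ?
print(is_self_adjoint, is_normal)
```

All entries are small Gaussian integers, so the floating point products are exact and comparing with `==` is safe here; for general matrices a tolerance would be needed.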


Theorem 3.3.11. Let $\mathcal{V}$ be a finite dimensional inner product space. For every linear operator $f : \mathcal{V} \to \mathcal{V}$, the operators $ff^*$, $f^*f$ and $f + f^*$ are self-adjoint.

Proof. Since
\[ (ff^*)^* = (f^*)^* f^* = ff^*, \]
$ff^*$ is self-adjoint. In the same way we can show that $f^*f$ is self-adjoint. Since
\[ (f + f^*)^* = f^* + (f^*)^* = f^* + f = f + f^*, \]
$f + f^*$ is self-adjoint. □


The composition of two self-adjoint operators need not be self-adjoint. The following theorem tells us exactly when it is true.


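Theorem 3.3.12. Let $f$ and $g$ be self-adjoint operators on a finite dimensional inner product space $\mathcal{V}$. The operator $fg$ is self-adjoint if and only if $fg = gf$.

Proof. If $f$ and $g$ are self-adjoint, then for every $\mathbf{v}, \mathbf{w} \in \mathcal{V}$ we have
\[ \langle fg(\mathbf{v}), \mathbf{w} \rangle = \langle g(\mathbf{v}), f(\mathbf{w}) \rangle = \langle \mathbf{v}, gf(\mathbf{w}) \rangle, \]
that is, $(fg)^* = gf$. Consequently, $fg = gf$ if and only if $fg$ is self-adjoint. □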
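The following useful result is a consequence of the polarization identity.

Theorem 3.3.13. Let $\mathcal{V}$ be a finite dimensional inner product space.

(a) If $f : \mathcal{V} \to \mathcal{V}$ is a self-adjoint operator such that $\langle f(\mathbf{v}), \mathbf{v} \rangle = 0$ for every $\mathbf{v} \in \mathcal{V}$, then $f = 0$.

(b) If $f_1, f_2 : \mathcal{V} \to \mathcal{V}$ are self-adjoint operators such that $\langle f_1(\mathbf{v}), \mathbf{v} \rangle = \langle f_2(\mathbf{v}), \mathbf{v} \rangle$ for every $\mathbf{v} \in \mathcal{V}$, then $f_1 = f_2$.

Proof. Let $f : \mathcal{V} \to \mathcal{V}$ be a self-adjoint operator. First we note that the form $s(\mathbf{v}, \mathbf{w}) = \langle f(\mathbf{v}), \mathbf{w} \rangle$ is sesquilinear. We show that $s$ is hermitian. Indeed,
\[ s(\mathbf{v}, \mathbf{w}) = \langle f(\mathbf{v}), \mathbf{w} \rangle = \langle \mathbf{v}, f^*(\mathbf{w}) \rangle = \langle \mathbf{v}, f(\mathbf{w}) \rangle = \overline{\langle f(\mathbf{w}), \mathbf{v} \rangle} = \overline{s(\mathbf{w}, \mathbf{v})}. \]
Now, if $\langle f(\mathbf{v}), \mathbf{v} \rangle = 0$ for every $\mathbf{v} \in \mathcal{V}$, then $\langle f(\mathbf{v}), \mathbf{w} \rangle = 0$ for every $\mathbf{v}, \mathbf{w} \in \mathcal{V}$, by Theorem 3.1.11 (or Theorem 3.1.6 for a real inner product space). Consequently, $f = 0$, proving part (a).

To prove part (b) we take $f = f_1 - f_2$ and use part (a). □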




Example 3.3.14. The above theorem does not hold if we drop the assumption 
that f is self-adjoint. For example, consider the operator f € L(C?, C?) defined 


7) ace 


for every i eC*, but 7 40. 


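Theorem 3.3.15. Let $f : \mathcal{V} \to \mathcal{V}$ be a self-adjoint operator on a finite dimensional inner product space $\mathcal{V}$. If $f^k = 0$ for some integer $k \geq 1$, then $f = 0$.

Proof. If $f^2 = 0$, then for every $\mathbf{v} \in \mathcal{V}$ we have
\[ 0 = \langle f^2(\mathbf{v}), \mathbf{v} \rangle = \langle f(\mathbf{v}), f(\mathbf{v}) \rangle = \|f(\mathbf{v})\|^2, \]
and thus $f = 0$. If $f^4 = (f^2)^2 = 0$, then $f^2 = 0$, because $f^2$ is self-adjoint, and thus $f = 0$. This way we can show that if $f^{2^n} = 0$, then $f = 0$. For any other integer $k \geq 1$ we find an integer $n \geq 1$ such that $k \leq 2^n$. Then, if $f^k = 0$, then $f^{2^n} = 0$ and thus $f = 0$. □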


The above property may seem obvious, but it's not true for arbitrary linear operators. For example, for the operator $f : \mathbb{C}^2 \to \mathbb{C}^2$ defined as $f(\mathbf{x}) = A\mathbf{x}$, where $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$, we have $f^2 = 0$, but $f \neq 0$.


We close this section with an important characterization of normal operators. 


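Theorem 3.3.16. Let $\mathcal{V}$ be a finite dimensional inner product space. A linear operator $f : \mathcal{V} \to \mathcal{V}$ is normal if and only if $\|f(\mathbf{v})\| = \|f^*(\mathbf{v})\|$ for every $\mathbf{v} \in \mathcal{V}$.

Proof. Assume $f$ is normal. Then for every $\mathbf{v} \in \mathcal{V}$ we have
\[ \|f(\mathbf{v})\|^2 = \langle f(\mathbf{v}), f(\mathbf{v}) \rangle = \langle f^*f(\mathbf{v}), \mathbf{v} \rangle = \langle ff^*(\mathbf{v}), \mathbf{v} \rangle = \langle f^*(\mathbf{v}), f^*(\mathbf{v}) \rangle = \|f^*(\mathbf{v})\|^2. \]
Now assume $\|f(\mathbf{v})\| = \|f^*(\mathbf{v})\|$ for every $\mathbf{v} \in \mathcal{V}$. Since
\[ \langle f^*f(\mathbf{v}), \mathbf{v} \rangle = \langle f(\mathbf{v}), f(\mathbf{v}) \rangle = \|f(\mathbf{v})\|^2 = \|f^*(\mathbf{v})\|^2 = \langle f^*(\mathbf{v}), f^*(\mathbf{v}) \rangle = \langle ff^*(\mathbf{v}), \mathbf{v} \rangle \]
for every $\mathbf{v} \in \mathcal{V}$, we have $ff^* = f^*f$ by Theorems 3.3.11 and 3.3.13. □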
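
The norm condition of Theorem 3.3.16 can be observed on concrete matrices. The following sketch is an illustration under stated assumptions: `A` is the normal matrix that appears in Example 3.4.20 later in this section, `N` is the nilpotent matrix $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ (which is not normal), and the test vector `v` is an arbitrary choice of ours.

```python
def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def conj_transpose(M):
    return [[M[i][j].conjugate() for i in range(len(M))] for j in range(len(M[0]))]

def norm_sq(x):
    # squared Euclidean norm of a complex vector
    return sum(abs(a) ** 2 for a in x)

A = [[3 + 2j, 2 - 4j],
     [4 + 2j, 6 + 1j]]     # normal (matrix of Example 3.4.20)
N = [[0, 1],
     [0, 0]]               # not normal

v = [1 + 0j, 2 + 0j]       # arbitrary test vector

# | ||M v||^2 - ||M* v||^2 | for each matrix
gap_normal = abs(norm_sq(apply(A, v)) - norm_sq(apply(conj_transpose(A), v)))
gap_other = abs(norm_sq(apply(N, v)) - norm_sq(apply(conj_transpose(N), v)))
print(gap_normal < 1e-9, gap_other)
```

For the normal matrix the two norms agree up to rounding, while for `N` they differ, in line with the theorem.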




3.4 Spectral theorems 


Spectral decomposition of matrices is one of the most important ideas in matrix linear algebra. Here we generalize this idea to operators on arbitrary finite dimensional inner product spaces.


3.4.1 Spectral theorems for operators on complex inner 
product spaces 


In this section all inner product spaces are assumed to be complex. 


Definition 3.4.1. Let $\mathcal{V}$ be a complex vector space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator.

(a) $\lambda \in \mathbb{C}$ is called an eigenvalue of $f$ if $f(\mathbf{v}) = \lambda\mathbf{v}$ for some nonzero $\mathbf{v} \in \mathcal{V}$.

(b) If $\lambda \in \mathbb{C}$ is an eigenvalue of $f$, then every nonzero vector $\mathbf{v} \in \mathcal{V}$ such that $f(\mathbf{v}) = \lambda\mathbf{v}$ is called an eigenvector of $f$ corresponding to $\lambda$.

The set of all eigenvectors of a linear operator $f$ corresponding to an eigenvalue $\lambda$ is not a vector subspace of $\mathcal{V}$ because the zero vector is not an eigenvector, but if we include the zero vector, then we obtain a subspace that is called the eigenspace of $f$ corresponding to $\lambda$ and is denoted by $\mathcal{E}_\lambda$.


Example 3.4.2. Consider the complex vector space $\mathcal{V} = \mathrm{Span}\{e^{t}, e^{2t}, \dots, e^{nt}\}$. Show that $1, 2, \dots, n$ are eigenvalues of the differential operator $\frac{d}{dt}$ on the space $\mathcal{V}$.

Solution. For every $k \in \{1, 2, \dots, n\}$ we have
\[ \frac{d}{dt} e^{kt} = k e^{kt}. \]
This means that $k$ is an eigenvalue of the differential operator $\frac{d}{dt}$ and the function $e^{kt}$ is an eigenvector corresponding to $k$. □


Example 3.4.3. Consider the real vector space $\mathcal{V} = \mathrm{Span}\{1, t, t^2, \dots, t^n\}$. Show that $0$ is the only eigenvalue of the differential operator $\frac{d}{dt}$ on the space $\mathcal{V}$.

Solution. Since $\frac{d}{dt} 1 = 0 = 0 \cdot 1$, the number $0$ is an eigenvalue of the differential operator $\frac{d}{dt}$ and the constant function $1$ is an eigenvector corresponding to $0$.

Now suppose there is a $\lambda \neq 0$ that is an eigenvalue of $\frac{d}{dt}$ on $\mathcal{V}$. Then
\[ \frac{d}{dt}\left(z_0 + z_1 t + \dots + z_k t^k\right) = \lambda\left(z_0 + z_1 t + \dots + z_k t^k\right) \]
for some $k \leq n$ and some $z_0, z_1, \dots, z_k \in \mathbb{R}$ such that $z_k \neq 0$. But this means that
\[ z_1 + 2z_2 t + \dots + k z_k t^{k-1} = \lambda\left(z_0 + z_1 t + \dots + z_k t^k\right), \]
which implies $\lambda z_k = 0$, a contradiction. □
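
A slightly different way to see the conclusion of Example 3.4.3 is through nilpotency: in the basis $1, t, \dots, t^n$ the matrix $D$ of $\frac{d}{dt}$ satisfies $D^{n+1} = 0$, so $D\mathbf{p} = \lambda\mathbf{p}$ with $\mathbf{p} \neq \mathbf{0}$ forces $\lambda^{n+1}\mathbf{p} = \mathbf{0}$ and hence $\lambda = 0$. The sketch below, which is not the argument used in the solution above and which fixes $n = 4$ purely for illustration, builds $D$ and checks that $D^{n+1}$ is the zero matrix.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = 4  # illustrative degree bound

# Column j holds the coordinates of d/dt (t^j) = j * t^(j-1)
# in the basis 1, t, ..., t^n, so the only nonzero entries are D[j-1][j] = j.
D = [[(j if i == j - 1 else 0) for j in range(n + 1)] for i in range(n + 1)]

P = D
for _ in range(n):          # after the loop, P equals D^(n+1)
    P = matmul(P, D)

is_nilpotent = all(entry == 0 for row in P for entry in row)
print(is_nilpotent)
```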


Example 3.4.4. This example uses derivatives of functions of the form $f : \mathbb{R} \to \mathbb{C}$.

Consider the complex vector space $\mathcal{V} = \mathrm{Span}\{\cos t, \sin t\}$. Show that $i$ and $-i$ are eigenvalues of the differential operator $\frac{d}{dt}$.

Solution. Since
\[ \frac{d}{dt}(\cos t + i \sin t) = -\sin t + i \cos t = i(\cos t + i \sin t), \]
$i$ is an eigenvalue of the differential operator $\frac{d}{dt}$ and the function $\cos t + i \sin t$ is an eigenvector corresponding to $i$. Similarly, since
\[ \frac{d}{dt}(\cos t - i \sin t) = -\sin t - i \cos t = -i(\cos t - i \sin t), \]
$-i$ is an eigenvalue of the differential operator $\frac{d}{dt}$ and the function $\cos t - i \sin t$ is an eigenvector corresponding to $-i$. □


In the next example we are assuming that $f : \mathcal{V} \to \mathcal{V}$ is an operator such that there is an orthonormal basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ of $\mathcal{V}$ consisting of eigenvectors of $f$. As we will see later in this section, every normal operator has this property.


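Example 3.4.5. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator such that there is an orthonormal basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ of $\mathcal{V}$ consisting of eigenvectors of $f$. Show that for every $\mathbf{v} \in \mathcal{V}$ we have
\[ f(\mathbf{v}) = \sum_{j=1}^{n} \lambda_j \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j, \]
where $\lambda_j$ is the eigenvalue corresponding to the eigenvector $\mathbf{e}_j$.

Solution. Since $\mathbf{v} = \sum_{j=1}^{n} \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j$ for every $\mathbf{v} \in \mathcal{V}$, we have
\[ f(\mathbf{v}) = f\Big(\sum_{j=1}^{n} \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j\Big) = \sum_{j=1}^{n} \langle \mathbf{v}, \mathbf{e}_j \rangle f(\mathbf{e}_j) = \sum_{j=1}^{n} \lambda_j \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j. \]
□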


The following theorem gives us two useful descriptions of eigenvalues. 


Theorem 3.4.6. Let $\mathcal{V}$ be a finite dimensional complex vector space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator. The following conditions are equivalent:

(a) $\lambda \in \mathbb{C}$ is an eigenvalue of $f$;
(b) $\ker(f - \lambda\,\mathrm{Id}) \neq \{\mathbf{0}\}$;
(c) The operator $f - \lambda\,\mathrm{Id}$ is not invertible.

Proof. Equivalence of (a) and (b) is an immediate consequence of the definitions. Equivalence of (b) and (c) follows from Theorem 2.1.23. □


Definition 3.4.7. If $f : \mathcal{V} \to \mathcal{V}$ is a linear operator on a vector space $\mathcal{V}$ and $p(z) = a_0 + a_1 z + \dots + a_m z^m$ is a polynomial, we define
\[ p(f) = a_0\,\mathrm{Id} + a_1 f + \dots + a_m f^m. \]

Since compositions and linear combinations of linear operators are linear operators, $p(f)$ is a linear operator. Clearly, if $p$ and $q$ are polynomials, then
\[ (p + q)(f) = p(f) + q(f) \quad\text{and}\quad (pq)(f) = p(f)q(f). \]


Theorem 3.4.8. If $\mathcal{V}$ is a nontrivial complex vector space of finite dimension, then every linear operator $f : \mathcal{V} \to \mathcal{V}$ has an eigenvalue.

Proof. Let $\dim \mathcal{V} = n$. Since $\dim \mathcal{L}(\mathcal{V}) = n^2$, the operators $\mathrm{Id}, f, f^2, \dots, f^{n^2}$ are linearly dependent and thus
\[ a_0\,\mathrm{Id} + a_1 f + a_2 f^2 + \dots + a_k f^k = 0 \]
for some $k \leq n^2$ and $a_0, a_1, a_2, \dots, a_k \in \mathbb{C}$ such that $a_k \neq 0$. Now, by the Fundamental Theorem of Algebra, there are complex numbers $z_1, \dots, z_k$ such that
\[ a_0 + a_1 t + \dots + a_k t^k = a_k (t - z_1) \cdots (t - z_k). \]
Consequently,
\[ a_0\,\mathrm{Id} + a_1 f + a_2 f^2 + \dots + a_k f^k = a_k (f - z_1\,\mathrm{Id}) \cdots (f - z_k\,\mathrm{Id}). \]
Since the operator $(f - z_1\,\mathrm{Id}) \cdots (f - z_k\,\mathrm{Id})$ is the zero operator, it is not invertible. Hence for at least one $j \in \{1, \dots, k\}$ the operator $f - z_j\,\mathrm{Id}$ is not invertible, which means that $z_j$ is an eigenvalue of $f$, as noted in Theorem 3.4.6. □


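Theorem 3.4.9. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a normal operator. If $\lambda$ is an eigenvalue of $f$, then $\overline{\lambda}$ is an eigenvalue of $f^*$. Moreover, every eigenvector of $f$ corresponding to $\lambda$ is an eigenvector of $f^*$ corresponding to $\overline{\lambda}$.

Proof. First we note that, if $f$ is a normal operator, then $f - \lambda\,\mathrm{Id}$ is a normal operator and we have
\[ (f - \lambda\,\mathrm{Id})^* = f^* - \overline{\lambda}\,\mathrm{Id}. \]
Let $\mathbf{v}$ be an eigenvector of $f$ corresponding to $\lambda$. Then, by Theorem 3.3.16, we have
\[ 0 = \|f(\mathbf{v}) - \lambda\mathbf{v}\| = \|f^*(\mathbf{v}) - \overline{\lambda}\mathbf{v}\| \]
and consequently
\[ f^*(\mathbf{v}) = \overline{\lambda}\mathbf{v}. \]
□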

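Theorem 3.4.10. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a normal operator. Eigenvectors of $f$ corresponding to different eigenvalues are orthogonal.

Proof. We need to show that if $\lambda$ and $\mu$ are two distinct eigenvalues of $f$ and $\mathbf{v}$ and $\mathbf{w}$ are eigenvectors of $f$ corresponding to $\lambda$ and $\mu$, respectively, then $\langle \mathbf{v}, \mathbf{w} \rangle = 0$. Indeed, since, by Theorem 3.4.9,
\[ \lambda \langle \mathbf{v}, \mathbf{w} \rangle = \langle f(\mathbf{v}), \mathbf{w} \rangle = \langle \mathbf{v}, f^*(\mathbf{w}) \rangle = \langle \mathbf{v}, \overline{\mu}\mathbf{w} \rangle = \mu \langle \mathbf{v}, \mathbf{w} \rangle, \]
we have $(\lambda - \mu)\langle \mathbf{v}, \mathbf{w} \rangle = 0$. Consequently, $\langle \mathbf{v}, \mathbf{w} \rangle = 0$, because $\lambda \neq \mu$. □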




Note that the property in the above theorem can be expressed as follows: 
Eigenspaces of a normal operator corresponding to different eigenvalues are mu- 
tually orthogonal subspaces. 


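Definition 3.4.11. A subspace $\mathcal{U}$ of a vector space $\mathcal{V}$ is called an invariant space of a linear operator $f : \mathcal{V} \to \mathcal{V}$, or simply $f$-invariant, if $f(\mathcal{U}) \subseteq \mathcal{U}$. A subspace $\mathcal{U}$ of an inner product space $\mathcal{V}$ is called a reducing space of a linear operator $f : \mathcal{V} \to \mathcal{V}$, if both $\mathcal{U}$ and $\mathcal{U}^\perp$ are $f$-invariant.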


Example 3.4.12. Let f : R? — R® be the linear operator defined by 


ae ya) |) |e 
if y =|0 8 O] ly 
z 1 —2 3] |z 
1 0 
Show that Span ¢ |0] , |0 is f-invariant. 
0 1 
Proof. We have 
1 2 1 0 
f | {0 = |0} €Span< |0} , |0 
0 1 0 il 
and 
0 G 1 0 
fi} jo = |0] €Span<¢ |0] , | 0 
il 3 0 il 


oO 


The following two theorems characterize invariant spaces and reducing spaces 
in terms of projections. 


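Theorem 3.4.13. Let $\mathcal{V}$ be an inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator. A finite dimensional subspace $\mathcal{U} \subseteq \mathcal{V}$ is $f$-invariant if and only if
\[ f\,\mathrm{proj}_{\mathcal{U}} = \mathrm{proj}_{\mathcal{U}}\, f\,\mathrm{proj}_{\mathcal{U}}. \]

Proof. If $\mathcal{U} \subseteq \mathcal{V}$ is $f$-invariant, then $\mathrm{proj}_{\mathcal{U}}(\mathbf{v}) \in \mathcal{U}$ and $f(\mathrm{proj}_{\mathcal{U}}(\mathbf{v})) \in \mathcal{U}$ for every $\mathbf{v} \in \mathcal{V}$. Consequently
\[ \mathrm{proj}_{\mathcal{U}}(f(\mathrm{proj}_{\mathcal{U}}(\mathbf{v}))) = f(\mathrm{proj}_{\mathcal{U}}(\mathbf{v})), \]
that is, $f\,\mathrm{proj}_{\mathcal{U}} = \mathrm{proj}_{\mathcal{U}}\, f\,\mathrm{proj}_{\mathcal{U}}$.

On the other hand, if $f\,\mathrm{proj}_{\mathcal{U}} = \mathrm{proj}_{\mathcal{U}}\, f\,\mathrm{proj}_{\mathcal{U}}$, then
\[ f(\mathbf{u}) = f(\mathrm{proj}_{\mathcal{U}}(\mathbf{u})) = \mathrm{proj}_{\mathcal{U}}(f(\mathrm{proj}_{\mathcal{U}}(\mathbf{u}))) \in \mathcal{U} \]
for every $\mathbf{u} \in \mathcal{U}$, which means that $\mathcal{U}$ is $f$-invariant. □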


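Theorem 3.4.14. Let $\mathcal{V}$ be an inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator. A finite dimensional subspace $\mathcal{U} \subseteq \mathcal{V}$ is a reducing subspace of $f$ if and only if
\[ f\,\mathrm{proj}_{\mathcal{U}} = \mathrm{proj}_{\mathcal{U}}\, f. \]

Proof. By Theorem 3.4.13, the subspace $\mathcal{U} \subseteq \mathcal{V}$ is a reducing subspace of $f$ if and only if
\[ f\,\mathrm{proj}_{\mathcal{U}} = \mathrm{proj}_{\mathcal{U}}\, f\,\mathrm{proj}_{\mathcal{U}} \quad\text{and}\quad f(\mathrm{Id} - \mathrm{proj}_{\mathcal{U}}) = (\mathrm{Id} - \mathrm{proj}_{\mathcal{U}})\, f\, (\mathrm{Id} - \mathrm{proj}_{\mathcal{U}}), \]
because $\mathrm{proj}_{\mathcal{U}^\perp} = \mathrm{Id} - \mathrm{proj}_{\mathcal{U}}$. The above is equivalent to
\[ f\,\mathrm{proj}_{\mathcal{U}} = \mathrm{proj}_{\mathcal{U}}\, f\,\mathrm{proj}_{\mathcal{U}} \quad\text{and}\quad \mathrm{proj}_{\mathcal{U}}\, f = \mathrm{proj}_{\mathcal{U}}\, f\,\mathrm{proj}_{\mathcal{U}}, \]
or simply to
\[ f\,\mathrm{proj}_{\mathcal{U}} = \mathrm{proj}_{\mathcal{U}}\, f, \]
because the equality $f\,\mathrm{proj}_{\mathcal{U}} = \mathrm{proj}_{\mathcal{U}}\, f$ implies
\[ \mathrm{proj}_{\mathcal{U}}\, f\,\mathrm{proj}_{\mathcal{U}} = \mathrm{proj}_{\mathcal{U}}\,\mathrm{proj}_{\mathcal{U}}\, f = \mathrm{proj}_{\mathcal{U}}\, f. \]
□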
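
The difference between the conditions in Theorems 3.4.13 and 3.4.14 shows up already for $2 \times 2$ matrices. In the sketch below the matrix `F` is a hypothetical upper triangular example of ours (not one from the text), acting on $\mathbb{R}^2$ with the standard inner product, and `P` is the matrix of the orthogonal projection on $\mathcal{U} = \mathrm{Span}\{(1, 0)\}$.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

F = [[1, 2],
     [0, 3]]      # hypothetical operator on R^2
P = [[1, 0],
     [0, 0]]      # orthogonal projection onto U = Span{(1, 0)}

FP = matmul(F, P)
PFP = matmul(P, FP)
PF = matmul(P, F)

is_invariant = (FP == PFP)   # criterion of Theorem 3.4.13
is_reducing = (FP == PF)     # criterion of Theorem 3.4.14
print(is_invariant, is_reducing)
```

Here $\mathcal{U}$ is $f$-invariant but not reducing: the vector $(0, 1) \in \mathcal{U}^\perp$ is sent to $(2, 3)$, which does not lie in $\mathcal{U}^\perp$.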


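Theorem 3.4.15. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator. If a subspace $\mathcal{U} \subseteq \mathcal{V}$ is an invariant subspace of $f$, then $\mathcal{U}^\perp$ is an invariant subspace of $f^*$.

Proof. Assume $\mathcal{U} \subseteq \mathcal{V}$ is an invariant subspace of $f$. If $\mathbf{u} \in \mathcal{U}$ and $\mathbf{v} \in \mathcal{U}^\perp$, then
\[ \langle f^*(\mathbf{v}), \mathbf{u} \rangle = \langle \mathbf{v}, f(\mathbf{u}) \rangle = 0, \]
because $f(\mathbf{u}) \in \mathcal{U}$. Consequently, $f^*(\mathbf{v}) \in \mathcal{U}^\perp$. □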




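Theorem 3.4.16. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a nonzero linear operator. The following conditions are equivalent:

(a) $f$ is a normal operator;

(b) There are orthonormal vectors $\mathbf{e}_1, \dots, \mathbf{e}_r \in \mathcal{V}$ and nonzero complex numbers $\lambda_1, \dots, \lambda_r$ such that for every $\mathbf{v} \in \mathcal{V}$ we have
\[ f(\mathbf{v}) = \sum_{j=1}^{r} \lambda_j \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j; \]

(c) There are orthonormal vectors $\mathbf{e}_1, \dots, \mathbf{e}_r \in \mathcal{V}$ and nonzero complex numbers $\lambda_1, \dots, \lambda_r$ such that
\[ f = \sum_{j=1}^{r} \lambda_j\,\mathrm{proj}_{\mathbf{e}_j}. \]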


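Proof. First we note that (b) and (c) are equivalent because $\mathrm{proj}_{\mathbf{e}_j}(\mathbf{v}) = \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j$ for every $\mathbf{v} \in \mathcal{V}$.

If $f(\mathbf{v}) = \sum_{j=1}^{r} \lambda_j \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j$ for every $\mathbf{v} \in \mathcal{V}$, then $f(\mathbf{e}_j) = \lambda_j \mathbf{e}_j$ for every $j = 1, \dots, r$ and hence
\begin{align*}
ff^*(\mathbf{v}) &= \sum_{j=1}^{r} \lambda_j \langle f^*(\mathbf{v}), \mathbf{e}_j \rangle \mathbf{e}_j = \sum_{j=1}^{r} \lambda_j \langle \mathbf{v}, f(\mathbf{e}_j) \rangle \mathbf{e}_j \\
&= \sum_{j=1}^{r} \lambda_j \langle \mathbf{v}, \lambda_j \mathbf{e}_j \rangle \mathbf{e}_j = \sum_{j=1}^{r} \lambda_j \overline{\lambda_j} \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j = \sum_{j=1}^{r} |\lambda_j|^2 \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j.
\end{align*}
On the other hand, since
\[ \langle f^*(\mathbf{e}_j), \mathbf{v} \rangle = \langle \mathbf{e}_j, f(\mathbf{v}) \rangle = \Big\langle \mathbf{e}_j, \sum_{k=1}^{r} \lambda_k \langle \mathbf{v}, \mathbf{e}_k \rangle \mathbf{e}_k \Big\rangle = \overline{\lambda_j}\, \overline{\langle \mathbf{v}, \mathbf{e}_j \rangle} = \langle \overline{\lambda_j} \mathbf{e}_j, \mathbf{v} \rangle \]
for every $\mathbf{v} \in \mathcal{V}$, we get $f^*(\mathbf{e}_j) = \overline{\lambda_j} \mathbf{e}_j$ for every $j = 1, \dots, r$. Consequently
\begin{align*}
f^*f(\mathbf{v}) &= \sum_{j=1}^{r} \lambda_j \langle \mathbf{v}, \mathbf{e}_j \rangle f^*(\mathbf{e}_j) \\
&= \sum_{j=1}^{r} \lambda_j \overline{\lambda_j} \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j = \sum_{j=1}^{r} |\lambda_j|^2 \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j.
\end{align*}
This shows that, if $f(\mathbf{v}) = \sum_{j=1}^{r} \lambda_j \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j$ for every $\mathbf{v} \in \mathcal{V}$, then $f$ is a normal operator, that is, (b) implies (a).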

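To complete the proof we show that (a) implies (c). Assume $f$ is a nonzero normal operator. By Theorem 3.4.8, $f$ has an eigenvalue $\lambda$. Let $\mathbf{e}$ be a unit eigenvector of $f$ corresponding to $\lambda$. Then, by Theorem 3.4.9, $\overline{\lambda}$ is an eigenvalue of $f^*$ with the same eigenvector $\mathbf{e}$ and thus the subspace $\mathrm{Span}\{\mathbf{e}\}$ is $f$-invariant and $f^*$-invariant. Consequently, by Theorem 3.4.15, $\mathrm{Span}\{\mathbf{e}\}^\perp$ is $f$-invariant and $f^*$-invariant.

Now we argue by induction on the dimension of the range of $f$.

If $\dim \operatorname{ran} f = 1$, then clearly $f = \lambda\,\mathrm{proj}_{\mathbf{e}}$ and we are done.

Now we assume that $\dim \operatorname{ran} f = r$ for some $r > 1$ and that the implication (a) implies (c) is proved for every normal operator $g$ such that $\dim \operatorname{ran} g = q < r$.

We denote $\lambda = \lambda_r$ and $\mathbf{e} = \mathbf{e}_r$. Because, as observed above, the subspaces $\mathrm{Span}\{\mathbf{e}_r\}$ and $\mathrm{Span}\{\mathbf{e}_r\}^\perp$ are $f$-invariant and $f^*$-invariant, the operators $f$ and $f^*$ commute with the projection on $\mathrm{Span}\{\mathbf{e}_r\}$ and we have, by Theorem 3.4.14,
\[ f\,\mathrm{proj}_{\mathbf{e}_r} = \mathrm{proj}_{\mathbf{e}_r} f = \lambda_r\,\mathrm{proj}_{\mathbf{e}_r} \quad\text{and}\quad f^*\,\mathrm{proj}_{\mathbf{e}_r} = \mathrm{proj}_{\mathbf{e}_r} f^* = \overline{\lambda_r}\,\mathrm{proj}_{\mathbf{e}_r}. \]
Consequently,
\[ (f - \lambda_r\,\mathrm{proj}_{\mathbf{e}_r})(f^* - \overline{\lambda_r}\,\mathrm{proj}_{\mathbf{e}_r}) = ff^* - |\lambda_r|^2\,\mathrm{proj}_{\mathbf{e}_r} = (f^* - \overline{\lambda_r}\,\mathrm{proj}_{\mathbf{e}_r})(f - \lambda_r\,\mathrm{proj}_{\mathbf{e}_r}), \]
which means that $f - \lambda_r\,\mathrm{proj}_{\mathbf{e}_r}$ is a normal operator. Moreover, since $\operatorname{ran} f = \mathrm{Span}\{\mathbf{e}_r\} \oplus f(\mathrm{Span}\{\mathbf{e}_r\}^\perp)$, we have
\[ \dim \operatorname{ran}(f - \lambda_r\,\mathrm{proj}_{\mathbf{e}_r}) = \dim f(\mathrm{Span}\{\mathbf{e}_r\}^\perp) = r - 1. \]
By our inductive assumption there are orthonormal vectors $\mathbf{e}_1, \dots, \mathbf{e}_{r-1}$ and nonzero complex numbers $\lambda_1, \dots, \lambda_{r-1}$ such that
\[ f - \lambda_r\,\mathrm{proj}_{\mathbf{e}_r} = \sum_{j=1}^{r-1} \lambda_j\,\mathrm{proj}_{\mathbf{e}_j}, \]
which gives us the desired representation $f = \sum_{j=1}^{r} \lambda_j\,\mathrm{proj}_{\mathbf{e}_j}$. □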


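Theorem 3.4.17. Let $\mathcal{V}$ be a finite dimensional inner product space. The operator $f : \mathcal{V} \to \mathcal{V}$ is normal if and only if there is an orthonormal basis of $\mathcal{V}$ consisting of eigenvectors of $f$.

Proof. Let $\dim \mathcal{V} = n$. The orthonormal vectors $\mathbf{e}_1, \dots, \mathbf{e}_r$ in Theorem 3.4.16 are eigenvectors of $f$ and we have $\operatorname{ran} f = \mathrm{Span}\{\mathbf{e}_1, \dots, \mathbf{e}_r\}$. Since $\mathcal{V} = \operatorname{ran} f \oplus (\operatorname{ran} f)^\perp$, we have $\dim (\operatorname{ran} f)^\perp = n - r$ and there are orthonormal vectors $\mathbf{e}_{r+1}, \dots, \mathbf{e}_n$ such that $(\operatorname{ran} f)^\perp = \mathrm{Span}\{\mathbf{e}_{r+1}, \dots, \mathbf{e}_n\}$.

If $\mathbf{v} \in (\operatorname{ran} f)^\perp$, then $\langle \mathbf{v}, f(f^*(\mathbf{v})) \rangle = 0$. Since $f$ is normal, we have $f(f^*(\mathbf{v})) = f^*(f(\mathbf{v}))$ and thus
\[ 0 = \langle \mathbf{v}, f(f^*(\mathbf{v})) \rangle = \langle \mathbf{v}, f^*(f(\mathbf{v})) \rangle = \|f(\mathbf{v})\|^2. \]
Hence $f(\mathbf{v}) = \mathbf{0}$ and thus $\mathbf{e}_{r+1}, \dots, \mathbf{e}_n \in \ker f$, which means that they are eigenvectors corresponding to the eigenvalue $0$.

By the Rank-Nullity Theorem we have $\dim \ker f = n - r$ and thus the set $\{\mathbf{e}_{r+1}, \dots, \mathbf{e}_n\}$ is a basis of $\ker f$. Consequently, $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is an orthonormal basis of $\mathcal{V}$ consisting of eigenvectors of $f$.

On the other hand, if there is an orthonormal basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ of $\mathcal{V}$ consisting of eigenvectors of $f$, then the operator $f$ is normal by Example 3.4.5 and Theorem 3.4.16. □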


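Example 3.4.18. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a normal operator such that for every $\mathbf{v} \in \mathcal{V}$ we have
\[ f(\mathbf{v}) = \sum_{j=1}^{n} \lambda_j \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j, \]
where $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is an orthonormal basis of $\mathcal{V}$ consisting of eigenvectors of $f$ and $\lambda_1, \dots, \lambda_n$ are the corresponding eigenvalues. Show that $p(f)$ is a normal operator and we have
\[ p(f)(\mathbf{v}) = \sum_{j=1}^{n} p(\lambda_j) \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j \]
for any polynomial $p$.

Solution. Since $\lambda_1, \dots, \lambda_n$ are eigenvalues of $f$ corresponding to eigenvectors $\mathbf{e}_1, \dots, \mathbf{e}_n$, we have $f^k(\mathbf{e}_j) = \lambda_j^k \mathbf{e}_j$ for every $j \in \{1, \dots, n\}$ and every integer $k \geq 1$.

Consequently, $p(f)(\mathbf{e}_j) = p(\lambda_j)\mathbf{e}_j$ for every $j \in \{1, \dots, n\}$, and so the linear operator $p(f)$ is normal according to Theorem 3.4.17. From Example 3.4.5 we get
\[ p(f)(\mathbf{v}) = \sum_{j=1}^{n} p(\lambda_j) \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j. \]
□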
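
The formula of Example 3.4.18 can be tried out on a small case. The sketch below is an illustration with data of our own choosing: the real symmetric matrix $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ with eigenvalues $3$ and $1$ and (unnormalized) eigenvectors $(1, 1)$ and $(1, -1)$, and the polynomial $p(z) = z^2 + 1$. It compares $p(A) = A^2 + I$ with $\sum_j p(\lambda_j) P_j$, where $P_j$ is the matrix of the projection on the $j$-th eigenvector. Exact rational arithmetic is used so that the comparison needs no tolerance.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def proj_matrix(v):
    # matrix of the orthogonal projection on Span{v} for a real vector v
    n2 = sum(a * a for a in v)
    return [[Fraction(a * b, n2) for b in v] for a in v]

A = [[2, 1],
     [1, 2]]                              # hypothetical example matrix
eigs = [(3, [1, 1]), (1, [1, -1])]        # (eigenvalue, eigenvector) pairs

def p(z):
    return z * z + 1                      # illustrative polynomial

# p(A) computed directly as A^2 + I
A2 = matmul(A, A)
p_of_A = [[A2[i][j] + (1 if i == j else 0) for j in range(2)] for i in range(2)]

# p(A) computed from the spectral decomposition
spectral = [[sum(p(lam) * proj_matrix(v)[i][j] for lam, v in eigs)
             for j in range(2)] for i in range(2)]

print(p_of_A == spectral)
```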




A representation of a linear operator $f : \mathcal{V} \to \mathcal{V}$ in the form
\[ f = \sum_{j=1}^{r} \lambda_j\,\mathrm{proj}_{\mathbf{e}_j}, \]
as in Theorem 3.4.16, is called a spectral decomposition of $f$.


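Example 3.4.19. In Example 3.3.10 we show that the operator $f \in \mathcal{L}(\mathbb{C}^2, \mathbb{C}^2)$ defined by $f(\mathbf{x}) = A\mathbf{x}$, where
\[ A = \begin{bmatrix} 2i & 1-i \\ -1-i & i \end{bmatrix}, \]
is normal. Find a spectral decomposition of $f$.

Solution. First we find eigenvalues of $f$. We need to find values of $\lambda \in \mathbb{C}$ such that the operator $f - \lambda\,\mathrm{Id}$ is not invertible. Since
\[ (f - \lambda\,\mathrm{Id})(\mathbf{x}) = \begin{bmatrix} 2i - \lambda & 1-i \\ -1-i & i - \lambda \end{bmatrix} \mathbf{x} \]
and
\[ \det \begin{bmatrix} 2i - \lambda & 1-i \\ -1-i & i - \lambda \end{bmatrix} = (2i - \lambda)(i - \lambda) - (1-i)(-1-i) = \lambda^2 - 3i\lambda = \lambda(\lambda - 3i), \]
$f$ has two eigenvalues: $0$ and $3i$.

Next we need to find an eigenvector corresponding to $3i$, that is a nonzero vector $\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \in \mathbb{C}^2$ such that
\[ \begin{bmatrix} 2i & 1-i \\ -1-i & i \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = 3i \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}. \]
Since the vector $\begin{bmatrix} 1+i \\ -1 \end{bmatrix}$ satisfies the above equation, it is an eigenvector corresponding to the eigenvalue $3i$ and
\[ f = 3i\,\mathrm{proj}_{\begin{bmatrix} 1+i \\ -1 \end{bmatrix}} \]
is a spectral decomposition of $f$.

The vector $\begin{bmatrix} 1 \\ 1-i \end{bmatrix}$ is an eigenvector corresponding to $0$, but it is not used in the spectral decomposition of $f$. Note that, as expected, the vectors $\begin{bmatrix} 1+i \\ -1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ 1-i \end{bmatrix}$ are orthogonal. □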


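Example 3.4.20. Show that the operator $f \in \mathcal{L}(\mathbb{C}^2, \mathbb{C}^2)$ defined by $f(\mathbf{x}) = A\mathbf{x}$, where
\[ A = \begin{bmatrix} 3+2i & 2-4i \\ 4+2i & 6+i \end{bmatrix}, \]
is normal and find a spectral decomposition of $f$.

Solution. Since
\begin{align*}
\det \begin{bmatrix} 3+2i-\lambda & 2-4i \\ 4+2i & 6+i-\lambda \end{bmatrix}
&= (3+2i-\lambda)(6+i-\lambda) - (4+2i)(2-4i) \\
&= 16 + 15i + \lambda^2 - (9+3i)\lambda - 16 + 12i \\
&= \lambda^2 - (9+3i)\lambda + 27i \\
&= (9-\lambda)(3i-\lambda),
\end{align*}
the eigenvalues of the operator $f$ are $9$ and $3i$.

Next we find that $\begin{bmatrix} 4i-2 \\ 2i-6 \end{bmatrix}$ is an eigenvector corresponding to $9$ and $\begin{bmatrix} 4i-2 \\ 3-i \end{bmatrix}$ is an eigenvector corresponding to $3i$. The vectors $\mathbf{v}_1 = \begin{bmatrix} 4i-2 \\ 2i-6 \end{bmatrix}$ and $\mathbf{v}_2 = \begin{bmatrix} 4i-2 \\ 3-i \end{bmatrix}$ are orthogonal and thus $\{\mathbf{v}_1, \mathbf{v}_2\}$ is an orthogonal basis of $\mathbb{C}^2$ and consequently the operator $f$ is normal. Thus
\[ f = 9\,\mathrm{proj}_{\mathbf{v}_1} + 3i\,\mathrm{proj}_{\mathbf{v}_2} \]
is a spectral decomposition of $f$.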




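For practical calculations we can use matrices of the projections $\mathrm{proj}_{\mathbf{v}_1}$ and $\mathrm{proj}_{\mathbf{v}_2}$, which gives us
\begin{align*}
f\left(\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}\right)
&= 9\left(\frac{1}{\sqrt{60}} \begin{bmatrix} 4i-2 \\ 2i-6 \end{bmatrix}\right)\left(\frac{1}{\sqrt{60}} \begin{bmatrix} -4i-2 & -2i-6 \end{bmatrix}\right) \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \\
&\quad + 3i\left(\frac{1}{\sqrt{30}} \begin{bmatrix} 4i-2 \\ 3-i \end{bmatrix}\right)\left(\frac{1}{\sqrt{30}} \begin{bmatrix} -4i-2 & 3+i \end{bmatrix}\right) \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \\
&= 3 \begin{bmatrix} 1 & 1-i \\ 1+i & 2 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} + i \begin{bmatrix} 2 & -1+i \\ -1-i & 1 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}
\end{align*}
for every $\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \in \mathbb{C}^2$. □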
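
A spectral decomposition such as the one in Example 3.4.20 can be checked by rebuilding the matrix from its pieces. The sketch below makes a few assumptions: it takes $A = \begin{bmatrix} 3+2i & 2-4i \\ 4+2i & 6+i \end{bmatrix}$ with eigenvalues $9$ and $3i$, and it uses the eigenvectors $(4i-2,\ 2i-6)$ and $(4i-2,\ 3-i)$ as we read them from the example (any nonzero multiples would give the same projections). The helper `proj_matrix` is ours; it forms $\mathbf{v}\mathbf{v}^*/\|\mathbf{v}\|^2$.

```python
def proj_matrix(v):
    # matrix of the orthogonal projection on Span{v}: v v* / ||v||^2
    n2 = sum(abs(a) ** 2 for a in v)
    return [[a * b.conjugate() / n2 for b in v] for a in v]

A = [[3 + 2j, 2 - 4j],
     [4 + 2j, 6 + 1j]]

v1 = [-2 + 4j, -6 + 2j]   # assumed eigenvector for the eigenvalue 9
v2 = [-2 + 4j, 3 - 1j]    # assumed eigenvector for the eigenvalue 3i

P1, P2 = proj_matrix(v1), proj_matrix(v2)

# 9 * proj_{v1} + 3i * proj_{v2}, entry by entry
recon = [[9 * P1[i][j] + 3j * P2[i][j] for j in range(2)] for i in range(2)]

max_err = max(abs(recon[i][j] - A[i][j]) for i in range(2) for j in range(2))
print(max_err < 1e-9)
```

Because the projections involve division by $60$ and $30$, the comparison is made up to a small floating point tolerance rather than exactly.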


As an immediate consequence of Theorem 3.4.16 we obtain the following 
result, which is also referred to as the spectral representation. 


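Theorem 3.4.21. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a normal operator. Then
\[ f = \sum_{j=1}^{q} \lambda_j\,\mathrm{proj}_{\mathcal{E}_{\lambda_j}}, \]
where $\lambda_1, \dots, \lambda_q$ are all distinct eigenvalues of $f$.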


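Let $\mathcal{V}$ be an inner product space and let $\mathcal{U}_1, \dots, \mathcal{U}_r$ be subspaces of $\mathcal{V}$ such that
\[ \mathcal{V} = \mathcal{U}_1 \oplus \dots \oplus \mathcal{U}_r. \]
If the subspaces $\mathcal{U}_1, \dots, \mathcal{U}_r$ are mutually orthogonal, that is, $\langle \mathbf{x}, \mathbf{y} \rangle = 0$ for any $\mathbf{x} \in \mathcal{U}_j$ and $\mathbf{y} \in \mathcal{U}_k$ with $j \neq k$, then we say that $\mathcal{U}_1 \oplus \dots \oplus \mathcal{U}_r$ is an orthogonal decomposition of the space $\mathcal{V}$.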

From Theorem 3.4.21 we obtain the following important result. 


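Theorem 3.4.22. Let $f : \mathcal{V} \to \mathcal{V}$ be a normal operator on a finite dimensional inner product space $\mathcal{V}$. If $\lambda_1, \dots, \lambda_q$ are all distinct eigenvalues of $f$, then $\mathcal{E}_{\lambda_1} \oplus \dots \oplus \mathcal{E}_{\lambda_q}$ is an orthogonal decomposition of $\mathcal{V}$.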


The decomposition of a normal operator in Theorem 3.4.21 is unique as 
shown in the next theorem. 




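Theorem 3.4.23. Let $\mathcal{V}$ be a finite-dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator. If $\mathcal{U}_1 \oplus \dots \oplus \mathcal{U}_r$ is an orthogonal decomposition of $\mathcal{V}$ and
\[ f = \sum_{j=1}^{r} \lambda_j\,\mathrm{proj}_{\mathcal{U}_j} \]
for some distinct $\lambda_1, \dots, \lambda_r \in \mathbb{K}$, then $\mathcal{U}_j = \mathcal{E}_{\lambda_j}$ for every $j \in \{1, \dots, r\}$.

Proof. If $\mathbf{v} \in \mathcal{U}_j$, then $f(\mathbf{v}) = \lambda_j \mathbf{v}$ and thus $\mathbf{v} \in \mathcal{E}_{\lambda_j}$.

Now, if $\mathbf{v} \in \mathcal{E}_{\lambda_k}$, then $f(\mathbf{v}) = \lambda_k \mathbf{v} = \sum_{j=1}^{r} \lambda_k\,\mathrm{proj}_{\mathcal{U}_j}(\mathbf{v})$ because $\sum_{j=1}^{r} \mathrm{proj}_{\mathcal{U}_j} = \mathrm{Id}_{\mathcal{V}}$. Since
\[ f(\mathbf{v}) = \sum_{j=1}^{r} \lambda_j\,\mathrm{proj}_{\mathcal{U}_j}(\mathbf{v}), \]
we have
\[ \sum_{j=1}^{r} \lambda_k\,\mathrm{proj}_{\mathcal{U}_j}(\mathbf{v}) = \sum_{j=1}^{r} \lambda_j\,\mathrm{proj}_{\mathcal{U}_j}(\mathbf{v}), \]
which, because the subspaces $\mathcal{U}_j$ are mutually orthogonal, gives us
\[ \sum_{j=1,\, j \neq k}^{r} |\lambda_k - \lambda_j|^2\, \|\mathrm{proj}_{\mathcal{U}_j}(\mathbf{v})\|^2 = 0. \]
Consequently, $\mathrm{proj}_{\mathcal{U}_j}(\mathbf{v}) = \mathbf{0}$ for $j \neq k$. This means that $\mathbf{v} = \mathrm{proj}_{\mathcal{U}_k}(\mathbf{v})$ and thus $\mathbf{v} \in \mathcal{U}_k$. □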


Now we turn our attention to spectral properties of self-adjoint operators. 


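Theorem 3.4.24. Let $\mathcal{V}$ be a complex inner product space. All eigenvalues of a self-adjoint operator $f : \mathcal{V} \to \mathcal{V}$ are real numbers.

Proof. Let $\lambda$ be an eigenvalue of a self-adjoint operator $f : \mathcal{V} \to \mathcal{V}$. If $\mathbf{x}$ is an eigenvector of $f$ corresponding to $\lambda$, then
\[ \lambda \langle \mathbf{x}, \mathbf{x} \rangle = \langle \lambda\mathbf{x}, \mathbf{x} \rangle = \langle f(\mathbf{x}), \mathbf{x} \rangle = \langle \mathbf{x}, f(\mathbf{x}) \rangle = \langle \mathbf{x}, \lambda\mathbf{x} \rangle = \overline{\lambda} \langle \mathbf{x}, \mathbf{x} \rangle. \]
Since $\langle \mathbf{x}, \mathbf{x} \rangle = \|\mathbf{x}\|^2 \neq 0$, we have $\lambda = \overline{\lambda}$. □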


The above property characterizes normal operators that are self-adjoint. 


Theorem 3.4.25. Let $\mathcal{V}$ be a finite dimensional complex inner product space. A normal operator $f : \mathcal{V} \to \mathcal{V}$ is self-adjoint if and only if all eigenvalues of $f$ are real.

Proof. In the proof of Theorem 3.4.16 we have shown that if
\[ f = \sum_{j=1}^{r} \lambda_j\,\mathrm{proj}_{\mathbf{e}_j}, \]
then
\[ f^* = \sum_{j=1}^{r} \overline{\lambda_j}\,\mathrm{proj}_{\mathbf{e}_j}. \]
Our result is a consequence of these equalities. □


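Example 3.4.26. Find a spectral decomposition of the self-adjoint operator $f \in \mathcal{L}(\mathbb{C}^2, \mathbb{C}^2)$ defined as
\[ f\left(\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}\right) = \begin{bmatrix} 33 & 24-24i \\ 24+24i & 57 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}. \]

Solution. Since
\begin{align*}
\det \begin{bmatrix} 33-\lambda & 24-24i \\ 24+24i & 57-\lambda \end{bmatrix}
&= (33-\lambda)(57-\lambda) - (24+24i)(24-24i) \\
&= (24 + 9 - \lambda)(48 + 9 - \lambda) - 24 \cdot 48 \\
&= (9-\lambda)(81-\lambda),
\end{align*}
the eigenvalues of $f$ are $9$ and $81$. The vectors $\begin{bmatrix} 1-i \\ -1 \end{bmatrix}$ and $\begin{bmatrix} 1-i \\ 2 \end{bmatrix}$ are eigenvectors of $f$ corresponding to the eigenvalues $9$ and $81$, respectively. Consequently,
\[ f = 9\,\mathrm{proj}_{\begin{bmatrix} 1-i \\ -1 \end{bmatrix}} + 81\,\mathrm{proj}_{\begin{bmatrix} 1-i \\ 2 \end{bmatrix}} \]
is a spectral decomposition of $f$.

Since the matrix of the projection on $\mathrm{Span}\left\{\begin{bmatrix} 1-i \\ -1 \end{bmatrix}\right\}$ is
\[ \left(\frac{1}{\sqrt{3}} \begin{bmatrix} 1-i \\ -1 \end{bmatrix}\right)\left(\frac{1}{\sqrt{3}} \begin{bmatrix} 1+i & -1 \end{bmatrix}\right) = \frac{1}{3} \begin{bmatrix} 2 & -1+i \\ -1-i & 1 \end{bmatrix} \]
and the matrix of the projection on $\mathrm{Span}\left\{\begin{bmatrix} 1-i \\ 2 \end{bmatrix}\right\}$ is
\[ \left(\frac{1}{\sqrt{6}} \begin{bmatrix} 1-i \\ 2 \end{bmatrix}\right)\left(\frac{1}{\sqrt{6}} \begin{bmatrix} 1+i & 2 \end{bmatrix}\right) = \frac{1}{3} \begin{bmatrix} 1 & 1-i \\ 1+i & 2 \end{bmatrix}, \]
we have
\[ f\left(\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}\right) = 3 \begin{bmatrix} 2 & -1+i \\ -1-i & 1 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} + 27 \begin{bmatrix} 1 & 1-i \\ 1+i & 2 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \]
for every $\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \in \mathbb{C}^2$. □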
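
Theorem 3.4.24 predicts that the eigenvalues of the matrix in Example 3.4.26 are real, and for a $2 \times 2$ matrix this can be confirmed with the quadratic formula applied to $\lambda^2 - (\operatorname{tr} H)\lambda + \det H = 0$. The sketch below is only an illustration; the function name `eigenvalues_2x2` is ours and the approach is specific to $2 \times 2$ matrices.

```python
import cmath

def eigenvalues_2x2(M):
    # roots of the characteristic polynomial t^2 - tr(M) t + det(M)
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

H = [[33, 24 - 24j],
     [24 + 24j, 57]]       # self-adjoint matrix of Example 3.4.26

lam1, lam2 = eigenvalues_2x2(H)
print(lam1, lam2)
```

The computation is carried out in complex arithmetic, so the fact that both roots come out with zero imaginary part is a genuine check rather than an artifact of the method.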


3.4.2 Self-adjoint operators on real inner product spaces 


Now we turn our attention to real inner product spaces. Some properties estab- 
lished for complex spaces remain true, but there are some essential differences. 


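Example 3.4.27. Consider the operator $f \in \mathcal{L}(\mathbb{R}^2, \mathbb{R}^2)$ defined by $f(\mathbf{x}) = A\mathbf{x}$, where
\[ A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}. \]
Show that $f$ is a normal operator without any eigenvalues.

Solution. First we note that $f^*(\mathbf{x}) = A^T\mathbf{x}$, where $A^T = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$. Since
\[ \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \]
$f$ is a normal operator. To show that $f$ has no eigenvalues we note that
\[ \det \begin{bmatrix} -\lambda & -1 \\ 1 & -\lambda \end{bmatrix} = \lambda^2 + 1 \]
and the equation $\lambda^2 + 1 = 0$ has no real solutions. (A complex number cannot be an eigenvalue of an operator on a real vector space.) □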
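
For $2 \times 2$ real matrices the existence of real eigenvalues is decided by the sign of the discriminant of the characteristic polynomial. The sketch below, with a function name and a symmetric comparison matrix of our own choosing, contrasts the rotation matrix of Example 3.4.27 with a symmetric matrix. For a symmetric matrix $\begin{bmatrix} a & b \\ b & d \end{bmatrix}$ the discriminant equals $(a-d)^2 + 4b^2 \geq 0$, so real eigenvalues always exist in that case.

```python
import math

def real_eigenvalues_2x2(M):
    # real roots of t^2 - tr(M) t + det(M); empty list if there are none
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    disc = tr * tr - 4 * det
    if disc < 0:
        return []
    r = math.sqrt(disc)
    return sorted({(tr - r) / 2, (tr + r) / 2})

rotation = [[0, -1],
            [1, 0]]        # matrix of Example 3.4.27
symmetric = [[2, 1],
             [1, 2]]       # hypothetical symmetric matrix for comparison

print(real_eigenvalues_2x2(rotation))
print(real_eigenvalues_2x2(symmetric))
```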


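A matrix $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ is called symmetric if $A^T = A$. Note that if $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ is symmetric, then the operator $f : \mathbb{R}^n \to \mathbb{R}^n$ defined by $f(\mathbf{x}) = A\mathbf{x}$ is self-adjoint.

Lemma 3.4.28. If $A \in \mathcal{M}_{n \times n}(\mathbb{R})$ is symmetric, then the operator $f : \mathbb{R}^n \to \mathbb{R}^n$ defined by $f(\mathbf{x}) = A\mathbf{x}$ has an eigenvalue.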




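Proof. Let $g : \mathbb{C}^n \to \mathbb{C}^n$ be the linear operator defined by $g(\mathbf{x}) = A\mathbf{x}$. By Theorem 3.4.8, $g$ has an eigenvalue $\lambda$. Since $A$ is a symmetric matrix, $g$ is self-adjoint and thus $\lambda$ is a real number by Theorem 3.4.24. Let $\mathbf{z} \in \mathbb{C}^n$ be an eigenvector corresponding to $\lambda$. We can write $\mathbf{z} = \mathbf{x} + i\mathbf{y}$ where $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$. Then
\[ A\mathbf{z} = A(\mathbf{x} + i\mathbf{y}) = \lambda(\mathbf{x} + i\mathbf{y}) = \lambda\mathbf{x} + i\lambda\mathbf{y}. \]
Since $A(\mathbf{x} + i\mathbf{y}) = A\mathbf{x} + iA\mathbf{y}$ and $\lambda$ is a real number, it follows that $A\mathbf{x} = \lambda\mathbf{x}$ and $A\mathbf{y} = \lambda\mathbf{y}$. Now, because $\mathbf{z} \neq \mathbf{0}$, we have $\mathbf{x} \neq \mathbf{0}$ or $\mathbf{y} \neq \mathbf{0}$. Consequently, $\lambda$ is an eigenvalue of $f$. □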


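Theorem 3.4.29. Let $\mathcal{V}$ be a finite dimensional real inner product space. Every self-adjoint operator $f : \mathcal{V} \to \mathcal{V}$ has an eigenvalue.

Proof. Let $B = \{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ be an orthonormal basis of $\mathcal{V}$ and let $A$ be the $B$-matrix of $f$. Since $f$ is self-adjoint, $A$ is a symmetric matrix. By Lemma 3.4.28, $A$ has an eigenvalue $\lambda \in \mathbb{R}$. Let $\mathbf{x} = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} \in \mathbb{R}^n$ be an eigenvector of $A$ corresponding to the eigenvalue $\lambda$. Then
\[ A \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \lambda \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \lambda x_1 \\ \vdots \\ \lambda x_n \end{bmatrix} \]
and thus
\[ f(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n) = (\lambda x_1)\mathbf{v}_1 + \dots + (\lambda x_n)\mathbf{v}_n = \lambda(x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n), \]
which means that the real number $\lambda$ is an eigenvalue of $f$ and $x_1\mathbf{v}_1 + \dots + x_n\mathbf{v}_n$ is an eigenvector of $f$ corresponding to $\lambda$. □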


The following theorem is a version of Theorem 3.4.16 for self-adjoint opera- 
tors on real inner product spaces. The proof is similar to the proof of Theorem 
3.4.16. Note that the above theorem is needed for the result. 




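Theorem 3.4.30. Let $\mathcal{V}$ be a finite dimensional inner product space over $\mathbb{R}$ and let $f : \mathcal{V} \to \mathcal{V}$ be a nonzero linear operator. The following conditions are equivalent:

(a) $f$ is a self-adjoint operator;

(b) There are orthonormal vectors $\mathbf{e}_1, \dots, \mathbf{e}_r \in \mathcal{V}$ and nonzero real numbers $\lambda_1, \dots, \lambda_r$ such that for every $\mathbf{v} \in \mathcal{V}$ we have
\[ f(\mathbf{v}) = \sum_{j=1}^{r} \lambda_j \langle \mathbf{v}, \mathbf{e}_j \rangle \mathbf{e}_j; \]

(c) There are orthonormal vectors $\mathbf{e}_1, \dots, \mathbf{e}_r \in \mathcal{V}$ and nonzero real numbers $\lambda_1, \dots, \lambda_r$ such that
\[ f = \sum_{j=1}^{r} \lambda_j\,\mathrm{proj}_{\mathbf{e}_j}. \]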

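Example 3.4.31. Find all eigenvalues and a spectral decomposition of the operator $f \in \mathcal{L}(\mathbb{R}^3, \mathbb{R}^3)$ defined by $f(\mathbf{x}) = A\mathbf{x}$, where
\[ A = \begin{bmatrix} 4 & 1 & 2 \\ 1 & 5 & 1 \\ 2 & 1 & 4 \end{bmatrix}. \]

Solution. Note that $A$ is a symmetric matrix, so $f$ has at least one real eigenvalue. We have to determine the real numbers $\lambda$ such that the system
\[ \begin{cases} (4-\lambda)x + y + 2z = 0 \\ x + (5-\lambda)y + z = 0 \\ 2x + y + (4-\lambda)z = 0 \end{cases} \]
has nontrivial solutions. By adding the second and third equations to the first one we get
\[ \begin{cases} (7-\lambda)(x + y + z) = 0 \\ x + (5-\lambda)y + z = 0 \\ 2x + y + (4-\lambda)z = 0 \end{cases} \]
If $\lambda = 7$, then the system has nontrivial solutions. If $\lambda \neq 7$, then the system is equivalent to
\[ \begin{cases} x + y + z = 0 \\ x + (5-\lambda)y + z = 0 \\ 2x + y + (4-\lambda)z = 0 \end{cases} \]
or
\[ \begin{cases} x + y + z = 0 \\ (4-\lambda)y = 0 \\ 2x + y + (4-\lambda)z = 0 \end{cases} \]
If $\lambda = 4$, then the system has nontrivial solutions. If $\lambda \neq 4$, then $y = 0$ and the system becomes
\[ \begin{cases} x + z = 0 \\ 2x + (4-\lambda)z = 0 \end{cases} \]
or
\[ \begin{cases} x + z = 0 \\ (2-\lambda)z = 0 \end{cases} \]
If $\lambda = 2$, then the system has nontrivial solutions. If $\lambda \neq 2$, then $x = y = z = 0$. Consequently, the eigenvalues of the matrix $\begin{bmatrix} 4 & 1 & 2 \\ 1 & 5 & 1 \\ 2 & 1 & 4 \end{bmatrix}$ are $7$, $4$ and $2$.

The vectors $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$ are eigenvectors corresponding to the eigenvalues $7$, $4$ and $2$, respectively. If we let
\[ \mathbf{v}_1 = \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \mathbf{v}_2 = \frac{1}{\sqrt{6}} \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}, \quad \mathbf{v}_3 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \]
then we can write
\[ f(\mathbf{x}) = 7\langle \mathbf{x}, \mathbf{v}_1 \rangle \mathbf{v}_1 + 4\langle \mathbf{x}, \mathbf{v}_2 \rangle \mathbf{v}_2 + 2\langle \mathbf{x}, \mathbf{v}_3 \rangle \mathbf{v}_3 \]
or
\[ f = 7\,\mathrm{proj}_{\mathbf{v}_1} + 4\,\mathrm{proj}_{\mathbf{v}_2} + 2\,\mathrm{proj}_{\mathbf{v}_3}. \]
□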
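
The result of Example 3.4.31 can be double-checked mechanically. The sketch below works with the unnormalized eigenvectors $(1,1,1)$, $(1,-2,1)$, $(1,0,-1)$ and exact rational arithmetic; dividing $\mathbf{v}\mathbf{v}^T$ by $\mathbf{v} \cdot \mathbf{v}$ gives the same projection matrix as normalizing first. The variable names are ours.

```python
from fractions import Fraction

A = [[4, 1, 2],
     [1, 5, 1],
     [2, 1, 4]]

# (eigenvalue, unnormalized eigenvector) pairs from the example
pairs = [(7, [1, 1, 1]), (4, [1, -2, 1]), (2, [1, 0, -1])]

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# A v = lambda v for each pair
eigen_ok = all(apply(A, v) == [lam * c for c in v] for lam, v in pairs)

# eigenvectors for different eigenvalues are mutually orthogonal
orthogonal = all(dot(pairs[i][1], pairs[j][1]) == 0
                 for i in range(3) for j in range(i + 1, 3))

# sum of lambda * (v v^T) / (v . v) should reproduce A exactly
recon = [[sum(Fraction(lam * v[i] * v[j], dot(v, v)) for lam, v in pairs)
          for j in range(3)] for i in range(3)]

print(eigen_ok, orthogonal, recon == A)
```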


3.4.3 Unitary operators 


In this section all inner product spaces are assumed to be complex. We discuss 
a special type of normal operators, called unitary operators. These operators 
have interesting geometric and algebraic properties similar to rotations about 
the origin in R”. 




Definition 3.4.32. Let $\mathcal{V}$ be a finite dimensional inner product space. A linear operator $f : \mathcal{V} \to \mathcal{V}$ is called a unitary operator if
\[ ff^* = f^*f = \mathrm{Id}, \]
where $\mathrm{Id}$ is the identity operator on $\mathcal{V}$.

Note that the condition in the above definition implies that $f$ is invertible, $\operatorname{ran} f = \mathcal{V}$, and $f^{-1} = f^*$.


Theorem 3.4.33. Let $\mathcal{V}$ be a finite dimensional inner product space. If $f : \mathcal{V} \to \mathcal{V}$ is a unitary operator, then $f^*$ and $f^{-1}$ are unitary.

Proof. If $f$ is unitary, then
\[ f^*(f^*)^* = f^*f = \mathrm{Id} \quad\text{and}\quad (f^*)^*f^* = ff^* = \mathrm{Id}, \]
and hence $f^*$ is unitary. Since $f^{-1} = f^*$, $f^{-1}$ is unitary. □


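Example 3.4.34. Consider the vector space $\mathcal{P}_n(\mathbb{C})$ with the inner product $\langle f, g \rangle = \int_0^1 f(t)\overline{g(t)}\,dt$. Show that the linear operator $\Phi : \mathcal{P}_n(\mathbb{C}) \to \mathcal{P}_n(\mathbb{C})$ defined as $\Phi(f)(t) = f(1-t)$ is a unitary operator.

Solution. Since, for every $f, g \in \mathcal{P}_n(\mathbb{C})$, we have
\[ \langle \Phi(f), g \rangle = \int_0^1 f(1-t)\overline{g(t)}\,dt = \int_0^1 f(t)\overline{g(1-t)}\,dt = \langle f, \Phi(g) \rangle, \]
$\Phi$ is self-adjoint. Hence, $\Phi^*\Phi = \Phi\Phi^* = \Phi^2 = \mathrm{Id}$. □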


Definition 3.4.35. Let $\mathcal{V}$ be a normed space. A linear operator $f : \mathcal{V} \to \mathcal{V}$ is called an isometric operator or an isometry if
\[ \|f(\mathbf{x})\| = \|\mathbf{x}\| \]
for every $\mathbf{x} \in \mathcal{V}$.




Example 3.4.36. Let $z$ and $w$ be two complex numbers such that $|z| = |w| = 1$. Find a linear isometry $f : \mathbb{C} \to \mathbb{C}$ such that $f(z) = w$.

Solution. The function $f(x) = w\overline{z}x$ has the desired property. Since for every $x \in \mathbb{C}$ we have
\[ f(x) = f\left(\frac{x}{z}\, z\right) = \frac{x}{z}\, f(z) = \frac{x}{z}\, w = w\overline{z}x, \]
this solution is unique.

This function $f$ can be interpreted as the rotation of the complex plane about the origin that takes $z$ to $w$. □
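
The isometry of Example 3.4.36 is simple enough to test directly. In the sketch below the unit complex numbers `z` and `w` and the test point `x` are arbitrary choices of ours; the function `f` implements $f(x) = w\overline{z}x$.

```python
import cmath

# Hypothetical unit complex numbers, chosen only for illustration
z = cmath.exp(0.3j)
w = cmath.exp(1.2j)

def f(x):
    # multiplication by w * conj(z), a rotation of the complex plane
    return w * z.conjugate() * x

x = 2 - 5j   # arbitrary test point

print(abs(f(z) - w) < 1e-12)            # f takes z to w
print(abs(abs(f(x)) - abs(x)) < 1e-12)  # f preserves the modulus
```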


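Example 3.4.37. Let $\mathcal{V}$ be a finite dimensional complex inner product space and let $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ be an orthogonal basis of $\mathcal{V}$. Show that if $\lambda_1, \dots, \lambda_n$ are complex numbers such that $|\lambda_1| = \dots = |\lambda_n| = 1$, then the linear operator $f : \mathcal{V} \to \mathcal{V}$ defined by
\[ f(a_1\mathbf{e}_1 + \dots + a_n\mathbf{e}_n) = \lambda_1 a_1 \mathbf{e}_1 + \dots + \lambda_n a_n \mathbf{e}_n \]
is an isometric operator.

Solution. Since the vectors $\mathbf{e}_1, \dots, \mathbf{e}_n$ are orthogonal, by the Pythagorean Theorem, we get
\begin{align*}
\|f(a_1\mathbf{e}_1 + \dots + a_n\mathbf{e}_n)\|^2 &= \|\lambda_1 a_1 \mathbf{e}_1 + \dots + \lambda_n a_n \mathbf{e}_n\|^2 \\
&= \|\lambda_1 a_1 \mathbf{e}_1\|^2 + \dots + \|\lambda_n a_n \mathbf{e}_n\|^2 \\
&= |\lambda_1|^2 \|a_1 \mathbf{e}_1\|^2 + \dots + |\lambda_n|^2 \|a_n \mathbf{e}_n\|^2 \\
&= \|a_1 \mathbf{e}_1\|^2 + \dots + \|a_n \mathbf{e}_n\|^2 \\
&= \|a_1 \mathbf{e}_1 + \dots + a_n \mathbf{e}_n\|^2.
\end{align*}
Consequently, $\|f(\mathbf{x})\| = \|\mathbf{x}\|$ for every $\mathbf{x} \in \mathcal{V}$. □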


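Theorem 3.4.38. Let $\mathcal{V}$ be an inner product space. A linear operator $f : \mathcal{V} \to \mathcal{V}$ is isometric if and only if it preserves the inner product, that is,
\[ \langle f(\mathbf{x}), f(\mathbf{y}) \rangle = \langle \mathbf{x}, \mathbf{y} \rangle \]
for every $\mathbf{x}, \mathbf{y} \in \mathcal{V}$.

Proof. If $\langle f(\mathbf{x}), f(\mathbf{y}) \rangle = \langle \mathbf{x}, \mathbf{y} \rangle$ for every $\mathbf{x}, \mathbf{y} \in \mathcal{V}$, then
\[ \|f(\mathbf{x})\|^2 = \langle f(\mathbf{x}), f(\mathbf{x}) \rangle = \langle \mathbf{x}, \mathbf{x} \rangle = \|\mathbf{x}\|^2 \]
for every $\mathbf{x} \in \mathcal{V}$ and thus $f$ is an isometric operator.

Now assume that $f$ is an isometric operator. From the Polarization Identity (Corollary 3.1.34) we get
\begin{align*}
\langle f(\mathbf{x}), f(\mathbf{y}) \rangle
&= \tfrac{1}{4}\left(\|f(\mathbf{x}) + f(\mathbf{y})\|^2 - \|f(\mathbf{x}) - f(\mathbf{y})\|^2 + i\|f(\mathbf{x}) + if(\mathbf{y})\|^2 - i\|f(\mathbf{x}) - if(\mathbf{y})\|^2\right) \\
&= \tfrac{1}{4}\left(\|f(\mathbf{x} + \mathbf{y})\|^2 - \|f(\mathbf{x} - \mathbf{y})\|^2 + i\|f(\mathbf{x} + i\mathbf{y})\|^2 - i\|f(\mathbf{x} - i\mathbf{y})\|^2\right) \\
&= \tfrac{1}{4}\left(\|\mathbf{x} + \mathbf{y}\|^2 - \|\mathbf{x} - \mathbf{y}\|^2 + i\|\mathbf{x} + i\mathbf{y}\|^2 - i\|\mathbf{x} - i\mathbf{y}\|^2\right) = \langle \mathbf{x}, \mathbf{y} \rangle
\end{align*}
for every $\mathbf{x}, \mathbf{y} \in \mathcal{V}$. □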


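Theorem 3.4.39. On a finite dimensional inner product space a linear operator is unitary if and only if it is isometric.

Proof. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a unitary operator. For every $\mathbf{x} \in \mathcal{V}$ we have
\[ \|f(\mathbf{x})\|^2 = \langle f(\mathbf{x}), f(\mathbf{x}) \rangle = \langle \mathbf{x}, f^*f(\mathbf{x}) \rangle = \langle \mathbf{x}, \mathbf{x} \rangle = \|\mathbf{x}\|^2, \]
which means that $f$ is an isometric operator.

Now we assume that $\mathcal{V}$ is a finite dimensional inner product space and $f : \mathcal{V} \to \mathcal{V}$ is an isometric operator. Note that the equality $\|f(\mathbf{x})\| = \|\mathbf{x}\|$ implies that $f$ is injective and thus $\dim \ker f = 0$. If $\dim \mathcal{V} = n$, then $\dim \operatorname{ran} f = n$ and hence $\operatorname{ran} f = \mathcal{V}$ and $f$ is invertible. In other words, $f$ is an isomorphism. By Theorem 3.4.38, for every $\mathbf{x}, \mathbf{y} \in \mathcal{V}$ we have
\[ \langle f^*f(\mathbf{x}), \mathbf{y} \rangle = \langle f(\mathbf{x}), f(\mathbf{y}) \rangle = \langle \mathbf{x}, \mathbf{y} \rangle, \]
and hence $f^*f = \mathrm{Id}$. Moreover, since $f$ is an isomorphism, for every $\mathbf{x} \in \mathcal{V}$, there is a $\mathbf{y} \in \mathcal{V}$ such that $\mathbf{x} = f(\mathbf{y})$ and we have
\[ ff^*(\mathbf{x}) = ff^*f(\mathbf{y}) = f(\mathbf{y}) = \mathbf{x}, \]
because $f^*f = \mathrm{Id}$. Hence $ff^* = \mathrm{Id}$. □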


From Theorems 3.4.38 and 3.4.39 we obtain the following geometric characterization of unitary operators on finite dimensional inner product spaces: unitary operators are linear operators that preserve the norm or the inner product.




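Corollary 3.4.40. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator. The following conditions are equivalent:

(a) $f$ is unitary;
(b) $\|f(\mathbf{x})\| = \|\mathbf{x}\|$ for every $\mathbf{x} \in \mathcal{V}$;
(c) $\langle f(\mathbf{x}), f(\mathbf{y}) \rangle = \langle \mathbf{x}, \mathbf{y} \rangle$ for every $\mathbf{x}, \mathbf{y} \in \mathcal{V}$.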
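
Conditions (b) and (c) of Corollary 3.4.40 can be observed numerically. The sketch below uses a hypothetical unitary matrix $U = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}$, whose columns are orthonormal in $\mathbb{C}^2$, together with test vectors of our own choosing, and assumes the standard inner product $\langle \mathbf{x}, \mathbf{y} \rangle = \sum_k x_k \overline{y_k}$.

```python
import math

def inner(x, y):
    # standard inner product on C^n, linear in the first argument
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

s = 1 / math.sqrt(2)
U = [[s, s],
     [s * 1j, -s * 1j]]    # hypothetical unitary matrix

x = [1 + 2j, -1j]          # arbitrary test vectors
y = [3 + 0j, 1 - 1j]

# condition (c): <Ux, Uy> = <x, y>
gap_inner = abs(inner(apply(U, x), apply(U, y)) - inner(x, y))
# condition (b): ||Ux||^2 = ||x||^2
gap_norm = abs(inner(apply(U, x), apply(U, x)) - inner(x, x))

print(gap_inner < 1e-12, gap_norm < 1e-12)
```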


In the following theorem we characterize unitary operators on $\mathcal{V}$ in terms of orthonormal bases in $\mathcal{V}$.


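Theorem 3.4.41. Let $\mathcal{V}$ be a finite dimensional inner product space and let $f : \mathcal{V} \to \mathcal{V}$ be a linear operator. The following conditions are equivalent:

(a) $f$ is unitary;

(b) $f$ is normal and $|\lambda| = 1$ for every eigenvalue $\lambda$ of $f$;

(c) There is an orthonormal basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ of $\mathcal{V}$ such that $\{f(\mathbf{e}_1), \dots, f(\mathbf{e}_n)\}$ is an orthonormal basis of $\mathcal{V}$;

(d) For every orthonormal basis $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ of $\mathcal{V}$, $\{f(\mathbf{v}_1), \dots, f(\mathbf{v}_n)\}$ is an orthonormal basis of $\mathcal{V}$;

(e) $\mathcal{V}$ has an orthonormal basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ of eigenvectors of $f$ corresponding to eigenvalues $\lambda_1, \dots, \lambda_n$ such that $|\lambda_1| = \dots = |\lambda_n| = 1$;

(f) $f(\mathbf{x}) = \sum_{j=1}^{n} \lambda_j \langle \mathbf{x}, \mathbf{e}_j \rangle \mathbf{e}_j$ for every $\mathbf{x} \in \mathcal{V}$, where $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ is an orthonormal basis of $\mathcal{V}$ and $|\lambda_1| = \dots = |\lambda_n| = 1$.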


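Proof. If $f$ is unitary, then $f$ is normal and, by Theorem 3.4.17, there is an orthonormal basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ of $\mathcal{V}$ consisting of eigenvectors of $f$. Let $\lambda_j$ be the eigenvalue of $f$ corresponding to the eigenvector $\mathbf{e}_j$. Then
\[ |\lambda_j| = \|\lambda_j \mathbf{e}_j\| = \|f(\mathbf{e}_j)\| = \|\mathbf{e}_j\| = 1, \]
so (a) implies (b).

Now we assume that $f$ is normal and $|\lambda| = 1$ for every eigenvalue $\lambda$ of $f$. By Theorem 3.4.17, there is an orthonormal basis $\{\mathbf{e}_1, \dots, \mathbf{e}_n\}$ of $\mathcal{V}$ consisting of eigenvectors of $f$. Let $\lambda_1, \dots, \lambda_n$ be the corresponding eigenvalues. Since $|\lambda_1| = \dots = |\lambda_n| = 1$, we have
\[ \langle f(\mathbf{e}_j), f(\mathbf{e}_k) \rangle = \langle \lambda_j \mathbf{e}_j, \lambda_k \mathbf{e}_k \rangle = \lambda_j \overline{\lambda_k} \langle \mathbf{e}_j, \mathbf{e}_k \rangle = \delta_{jk} \]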

3.4. SPECTRAL THEOREMS 185 


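and thus {f(e_1), ..., f(e_n)} is an orthonormal basis of V. This shows that (b)
implies (c).

Next we assume that there is an orthonormal basis {e_1, ..., e_n} of V such
that {f(e_1), ..., f(e_n)} is an orthonormal basis of V. Let {v_1, ..., v_n} be an
orthonormal basis of V. Then for all j ∈ {1, ..., n} we have

\[ v_j = \sum_{m=1}^{n} \alpha_{jm} e_m, \]

where α_{jm} = ⟨v_j, e_m⟩. Since

\[
\begin{aligned}
\langle f(v_j), f(v_k)\rangle
&= \Big\langle f\Big(\sum_{l=1}^{n} \alpha_{jl} e_l\Big), f\Big(\sum_{m=1}^{n} \alpha_{km} e_m\Big)\Big\rangle \\
&= \sum_{l=1}^{n}\sum_{m=1}^{n} \alpha_{jl}\overline{\alpha_{km}}\, \langle f(e_l), f(e_m)\rangle \\
&= \sum_{l=1}^{n}\sum_{m=1}^{n} \alpha_{jl}\overline{\alpha_{km}}\, \delta_{lm} \\
&= \sum_{l=1}^{n}\sum_{m=1}^{n} \alpha_{jl}\overline{\alpha_{km}}\, \langle e_l, e_m\rangle \\
&= \Big\langle \sum_{l=1}^{n} \alpha_{jl} e_l, \sum_{m=1}^{n} \alpha_{km} e_m\Big\rangle \\
&= \langle v_j, v_k\rangle = \delta_{jk},
\end{aligned}
\]

{f(v_1), ..., f(v_n)} is an orthonormal basis of V, proving that (c) implies (d).

Let {v_1, ..., v_n} be an orthonormal basis of V. If {f(v_1), ..., f(v_n)} is an
orthonormal basis, then

\[
\begin{aligned}
\|f(x)\|^2
&= \Big\| f\Big(\sum_{j=1}^{n} \langle x, v_j\rangle v_j\Big)\Big\|^2
 = \Big\| \sum_{j=1}^{n} \langle x, v_j\rangle f(v_j)\Big\|^2 \\
&= \sum_{j=1}^{n} |\langle x, v_j\rangle|^2 \|f(v_j)\|^2
 = \sum_{j=1}^{n} |\langle x, v_j\rangle|^2 = \|x\|^2
\end{aligned}
\]

for every x ∈ V. Consequently, by Corollary 3.4.40, f is unitary and thus (d)
implies (a).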

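So far we have proved that (a)-(d) are all equivalent. Clearly (b) implies (e)
and (e) implies (f).

Finally, if we assume that there is an orthonormal basis {e_1, ..., e_n} of V
and numbers λ_1, ..., λ_n with |λ_1| = ··· = |λ_n| = 1 such that

\[ f(x) = \sum_{j=1}^{n} \lambda_j \langle x, e_j\rangle e_j \]

for every x ∈ V, then

\[
\begin{aligned}
\|f(x)\|^2
&= \Big\| \sum_{j=1}^{n} \lambda_j \langle x, e_j\rangle e_j \Big\|^2
 = \sum_{j=1}^{n} \|\lambda_j \langle x, e_j\rangle e_j\|^2 \\
&= \sum_{j=1}^{n} \|\langle x, e_j\rangle e_j\|^2
 = \Big\| \sum_{j=1}^{n} \langle x, e_j\rangle e_j \Big\|^2 = \|x\|^2.
\end{aligned}
\]

By Corollary 3.4.40, f is unitary and thus (f) implies (a), completing the proof
of the theorem. □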


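Example 3.4.42. Let {a, b, c} and {u, v, w} be two orthogonal bases in an
inner product space V. Find an isometry f such that Span{f(a)} = Span{u},
Span{f(b)} = Span{v}, and Span{f(c)} = Span{w}.

Solution.

\[ f(x) = \Big\langle x, \frac{a}{\|a\|}\Big\rangle \frac{u}{\|u\|} + \Big\langle x, \frac{b}{\|b\|}\Big\rangle \frac{v}{\|v\|} + \Big\langle x, \frac{c}{\|c\|}\Big\rangle \frac{w}{\|w\|}. \]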


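Example 3.4.43. Consider the operator f ∈ ℒ(ℂ², ℂ²) defined by f(x) = Ax,
where

\[ A = \frac{1}{5}\begin{bmatrix} 4+i & 2-2i \\ 2-2i & 1+4i \end{bmatrix}. \]

Show that f is unitary and find its spectral decomposition.

Solution. We can verify that f f* = f*f = Id by simple matrix multiplication.
Since the roots of the equation

\[ (4+i-\lambda)(1+4i-\lambda) - (2-2i)^2 = \lambda^2 - (5+5i)\lambda + 25i = 0 \]

are 5 and 5i, the numbers 1 and i are the eigenvalues of f and

\[ \frac{1}{\sqrt5}\begin{bmatrix} 2 \\ 1 \end{bmatrix} \quad\text{and}\quad \frac{1}{\sqrt5}\begin{bmatrix} 1 \\ -2 \end{bmatrix} \]

are unit eigenvectors corresponding to 1 and i. Consequently,

\[ A = \begin{bmatrix} \frac{2}{\sqrt5} \\ \frac{1}{\sqrt5} \end{bmatrix}\begin{bmatrix} \frac{2}{\sqrt5} & \frac{1}{\sqrt5} \end{bmatrix} + i \begin{bmatrix} \frac{1}{\sqrt5} \\ -\frac{2}{\sqrt5} \end{bmatrix}\begin{bmatrix} \frac{1}{\sqrt5} & -\frac{2}{\sqrt5} \end{bmatrix}. \]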
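The computations in Example 3.4.43 can be checked numerically. The following is a minimal sketch in pure Python, assuming that the matrix of the example is A = (1/5)[[4+i, 2−2i], [2−2i, 1+4i]] (which agrees with the characteristic equation in the solution) and that (2, 1)/√5 and (1, −2)/√5 are unit eigenvectors for 1 and i. The helper names `matmul`, `adjoint`, and `close` are ours and do not come from the text.

```python
# Numerical check of Example 3.4.43 (helper names are illustrative).

def matmul(X, Y):
    """Product of matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose of a matrix."""
    return [[X[i][j].conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

def close(X, Y, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol
               for i in range(len(X)) for j in range(len(X[0])))

# Assumed reading of the matrix in the example.
A = [[(4 + 1j) / 5, (2 - 2j) / 5],
     [(2 - 2j) / 5, (1 + 4j) / 5]]
identity = [[1, 0], [0, 1]]

is_unitary = (close(matmul(A, adjoint(A)), identity)
              and close(matmul(adjoint(A), A), identity))

# Unit eigenvectors for the eigenvalues 1 and i, written as columns.
s = 5 ** 0.5
v1 = [[2 / s], [1 / s]]
v2 = [[1 / s], [-2 / s]]

# Spectral decomposition: A = 1 * v1 v1^* + i * v2 v2^*.
P1 = matmul(v1, adjoint(v1))
P2 = matmul(v2, adjoint(v2))
reassembled = [[P1[i][j] + 1j * P2[i][j] for j in range(2)] for i in range(2)]

print(is_unitary, close(reassembled, A))
```

Replacing an eigenvector by a multiple of modulus 1 does not change the projection matrices, so the sign convention chosen for v2 is immaterial.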




The operator in the next example is a normal operator that is not unitary. 


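Example 3.4.44. Consider the operator f ∈ ℒ(ℂ², ℂ²) defined by f(x) = Ax,
where

\[ A = \begin{bmatrix} 5i & 3+4i \\ -3+4i & 5i \end{bmatrix}. \]

It is easy to verify that f f* = f*f, so f is a normal operator, but it is
not unitary. Since

\[ f\left(\begin{bmatrix} 4-3i \\ -5 \end{bmatrix}\right) = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad\text{and}\quad f\left(\begin{bmatrix} 4-3i \\ 5 \end{bmatrix}\right) = 10i \begin{bmatrix} 4-3i \\ 5 \end{bmatrix}, \]

f has eigenvalues 0 and 10i, and the vectors (4 − 3i, −5) and (4 − 3i, 5) are
corresponding eigenvectors.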
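The claims of Example 3.4.44 can also be verified by direct computation. The sketch below assumes that the matrix is A = [[5i, 3+4i], [−3+4i, 5i]] (the unique completion with the visible entries, trace 10i, and determinant 0) and uses the eigenvectors (4 − 3i, −5) and (4 − 3i, 5), which are one possible choice; any nonzero multiples would do. Helper names are ours.

```python
# Numerical check of Example 3.4.44 (helper names are illustrative).

def matmul(X, Y):
    """Product of matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose of a matrix."""
    return [[X[i][j].conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

def close(X, Y, tol=1e-9):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol
               for i in range(len(X)) for j in range(len(X[0])))

# Assumed reading of the matrix in the example.
A = [[5j, 3 + 4j],
     [-3 + 4j, 5j]]

A_Astar = matmul(A, adjoint(A))
Astar_A = matmul(adjoint(A), A)
is_normal = close(A_Astar, Astar_A)
is_unitary = close(A_Astar, [[1, 0], [0, 1]])

# Candidate eigenvectors, written as columns.
u0 = [[4 - 3j], [-5]]   # for the eigenvalue 0
u1 = [[4 - 3j], [5]]    # for the eigenvalue 10i
A_u0 = matmul(A, u0)
A_u1 = matmul(A, u1)

print(is_normal, is_unitary)
```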


3.4.4 Orthogonal operators on real inner product spaces 


In this section we discuss the orthogonal operators on finite dimensional real 
inner product spaces, that is, operators that preserve the inner product. All 
inner product spaces considered in this section are assumed to be real. 


Definition 3.4.45. Let V be a real inner product space. A linear operator
f : V → V is called an orthogonal operator if

\[ \langle f(x), f(y)\rangle = \langle x, y\rangle \]

for every x, y ∈ V.


Theorem 3.4.46. Let V be a finite dimensional real inner product space
and let f : V → V be a linear operator. The following conditions are
equivalent:

(a) f is orthogonal;
(b) f f* = f*f = Id;
(c) ‖f(x)‖ = ‖x‖ for every x ∈ V.




Proof. Let f : V → V be an orthogonal operator. For every x, y ∈ V we have

\[ \langle f^* f(x), y\rangle = \langle f(x), f(y)\rangle = \langle x, y\rangle \]

and thus

\[ \langle f^* f(x) - x, y\rangle = 0. \]

If we take y = f*f(x) − x, then we get

\[ \|f^* f(x) - x\|^2 = \langle f^* f(x) - x, f^* f(x) - x\rangle = 0. \]

Consequently, f*f(x) − x = 0, and hence f*f = Id. Since V is finite dimensional,
we also have f f* = Id, by Theorem 2.2.11. This proves that (a) implies (b).
If f f* = f*f = Id, then for every x ∈ V we have

\[ \|f(x)\|^2 = \langle f(x), f(x)\rangle = \langle x, f^* f(x)\rangle = \langle x, x\rangle = \|x\|^2, \]

and thus (b) implies (c).
Finally, if ‖f(x)‖ = ‖x‖ for every x ∈ V, then

\[
\begin{aligned}
2\langle f(x), f(y)\rangle &= \|f(x+y)\|^2 - \|f(x)\|^2 - \|f(y)\|^2 \\
&= \|x+y\|^2 - \|x\|^2 - \|y\|^2 = 2\langle x, y\rangle
\end{aligned}
\]

for every x, y ∈ V. This proves that (c) implies (a). □


While some properties of orthogonal operators on finite dimensional real 
inner product spaces are the same as properties of unitary operators on finite 
dimensional complex inner product spaces, there are some essential differences. 


Theorem 3.4.47. Let V be a finite dimensional real inner product space
and let f : V → V be an orthogonal operator. If λ is an eigenvalue of f,
then λ = 1 or λ = −1.


Proof. If v is an eigenvector corresponding to the eigenvalue λ, then f(v) = λv
and thus

\[ |\lambda|\,\|v\| = \|\lambda v\| = \|f(v)\| = \|v\|. \]

Consequently, |λ| = 1 because v ≠ 0 and, because λ is a real number, λ = 1 or
λ = −1. □


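Note that the above theorem does not say that every orthogonal operator
on a finite dimensional real inner product space has an eigenvalue. Indeed, if
f : ℝ² → ℝ² is the operator f(x) = Ax, where

\[ A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \]

then

\[ \left\langle f\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right), f\left(\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}\right)\right\rangle = \left\langle \begin{bmatrix} -x_2 \\ x_1 \end{bmatrix}, \begin{bmatrix} -y_2 \\ y_1 \end{bmatrix}\right\rangle = x_2 y_2 + x_1 y_1 = \left\langle \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}\right\rangle, \]

so f is an orthogonal operator. On the other hand, since

\[ \det(A - \lambda I) = \begin{vmatrix} -\lambda & -1 \\ 1 & -\lambda \end{vmatrix} = \lambda^2 + 1, \]

f has no real eigenvalues.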
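The two observations about this operator are easy to confirm by computation. The sketch below assumes the matrix A = [[0, −1], [1, 0]] (rotation by a right angle); the sample vectors are arbitrary and the function names are ours.

```python
# Rotation by a right angle: orthogonal, but without real eigenvalues.

A = [[0, -1],
     [1, 0]]

def apply(M, x):
    """Apply a 2 x 2 matrix to a vector given as a list of two numbers."""
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def dot(x, y):
    """Standard inner product on R^2."""
    return x[0] * y[0] + x[1] * y[1]

# A few arbitrary integer test pairs (exact arithmetic, so == is safe).
pairs = [([1, 2], [3, -1]), ([0, 5], [-2, 7]), ([4, 4], [1, -6])]
preserves_inner_product = all(
    dot(apply(A, x), apply(A, y)) == dot(x, y) for x, y in pairs)

# For a 2 x 2 matrix, det(A - t I) = t^2 - trace * t + det.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
discriminant = trace ** 2 - 4 * det   # negative: no real roots

print(preserves_inner_product, discriminant)
```

Checking finitely many pairs does not prove orthogonality, of course; the proof is the identity displayed above. The negative discriminant is the computational counterpart of the polynomial λ² + 1 having no real roots.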


Lemma 3.4.48. Let V be a finite dimensional real inner product space
and let f : V → V be an orthogonal operator. If U is an f-invariant
subspace of V, then U^⊥ is also f-invariant.


Proof. Let U be an f-invariant subspace of V. Since f is an isomorphism, we
have f(U) = U. Let v ∈ U^⊥ and u ∈ U. Then there is w ∈ U such that
f(w) = u and we have

\[ \langle f(v), u\rangle = \langle f(v), f(w)\rangle = \langle v, w\rangle = 0. \]

Thus f(v) ∈ U^⊥. □


Lemma 3.4.49. Let V be a finite dimensional real inner product space
and let f : V → V be an orthogonal operator. There is an f-invariant
subspace U ⊆ V such that dim U = 1 or dim U = 2.


Proof. Since the operator f + f* is self-adjoint, it has an eigenvalue λ ∈ ℝ. Let
v be an eigenvector corresponding to λ. Then

\[ (f + f^*)(v) = \lambda v \]

and hence

\[ (f^2 + f f^*)(v) = \lambda f(v). \]

Since f f* = Id, the above can be written as

\[ f^2(v) + v = \lambda f(v) \]

or

\[ f^2(v) = \lambda f(v) - v. \]

Consequently, f²(v) ∈ Span{v, f(v)} and thus U = Span{v, f(v)} is f-invariant
and dim U = 1 or dim U = 2. □




Theorem 3.4.50. Let V be a finite dimensional real inner product space
and let f : V → V be an orthogonal operator. There are f-invariant
subspaces U_1, ..., U_n ⊆ V such that

\[ V = U_1 \oplus \cdots \oplus U_n \]

and dim U_j = 1 or dim U_j = 2 for every j ∈ {1, ..., n}.


Proof. By Lemma 3.4.49 there is an f-invariant subspace U_1 ⊆ V such that
dim U_1 = 1 or dim U_1 = 2. We have V = U_1 ⊕ U_1^⊥.

Now, by Lemma 3.4.48, U_1^⊥ is f-invariant and thus we can define an operator
g : U_1^⊥ → U_1^⊥ by g(x) = f(x) for every x ∈ U_1^⊥. Clearly g is an orthogonal
operator. Since dim U_1^⊥ = dim V − 1 or dim U_1^⊥ = dim V − 2, we can apply
Lemma 3.4.49 to g and proceed as before. This gives us the desired result by
induction. □


If f : V → V is an orthogonal operator on a real inner product space V, then
by the above theorem, V is a direct sum of f-invariant subspaces U_1, ..., U_n of
dimension 1 or 2. If dim U_j = 1, then f(v) = v or f(v) = −v for every v ∈ U_j.
Now we are going to consider the case when dim U_j = 2.


Theorem 3.4.51. Let V be a real inner product space such that dim V = 2
and let f : V → V be an orthogonal operator. If {v, w} is an orthonormal
basis of V such that

\[ f(v) = av + bw \quad\text{and}\quad f(w) = cv + dw, \]

where a, b, c, d ∈ ℝ, then one of the following conditions holds:

(a) ad − bc = 1 and there is a unique number θ ∈ (−π, π] such that

\[ f(v) = \cos\theta\, v + \sin\theta\, w \quad\text{and}\quad f(w) = -\sin\theta\, v + \cos\theta\, w; \]

(b) ad − bc = −1 and there is an orthonormal basis {u_1, u_2} of V such
that f(u_1) = u_1 and f(u_2) = −u_2.


Note that in the second case u_1 and u_2 are eigenvectors of f corresponding
to eigenvalues 1 and −1, respectively.


Proof. From ‖f(v)‖² = ‖f(w)‖² = 1 and ⟨f(v), f(w)⟩ = 0 we get

\[ a^2 + b^2 = 1, \qquad c^2 + d^2 = 1, \qquad ac + bd = 0. \]

If a ≠ 0, then c = −(d/a)b and d = (d/a)a. If we let t = d/a, then we have

\[ 1 = c^2 + d^2 = t^2(a^2 + b^2) = t^2 \]

and thus t = 1 or t = −1.

If t = 1, then c = −b, d = a, ad − bc = 1, and there is a unique θ ∈ (−π, π]
such that a = cos θ and b = sin θ.

If t = −1, then c = b, d = −a, and ad − bc = −1. Since for any p, q, x, y ∈ ℝ
we have

\[ \langle f(pv + qw), xv + yw\rangle = apx + bqx + bpy - aqy = \langle pv + qw, f(xv + yw)\rangle, \]

f is self-adjoint. Because f is orthogonal, if λ is an eigenvalue of f, then λ = 1
or λ = −1.

Next we calculate the corresponding eigenvectors.

To solve the equation

\[ f(xv + yw) - (xv + yw) = (x(a-1) + yb)v + (xb - (a+1)y)w = 0 \]

we need to solve the system

\[ x(a-1) + yb = 0, \qquad xb - (a+1)y = 0. \]

It is easy to verify that x = −b and y = a − 1 is a nonzero solution if b ≠ 0 or
a ≠ 1. The same way we show that the equation

\[ f(xv + yw) + (xv + yw) = (x(a+1) + yb)v + (xb - (a-1)y)w = 0 \]

has a nonzero solution x = −b and y = a + 1 if b ≠ 0 or a ≠ −1.

Note that, as expected, the vectors −bv + (a − 1)w and −bv + (a + 1)w are
orthogonal because a² + b² = 1.

The cases b = 0 and a = 1 as well as b = 0 and a = −1 are trivial. Consequently,
there is a basis {u_1, u_2} of V consisting of orthonormal eigenvectors of
f such that f(u_1) = u_1 and f(u_2) = −u_2.

Finally, if a = 0, then d = 0 and b² = c² = 1. There are four possibilities:

\[ (b, c) = (1, -1), \qquad (b, c) = (-1, 1), \qquad (b, c) = (1, 1), \qquad (b, c) = (-1, -1). \]

In the first two cases ad − bc = 1 and f is of the form described in (a) with
θ = π/2 or θ = −π/2. In the last two cases ad − bc = −1, the operator f is
self-adjoint, and it has eigenvalues 1 and −1. □
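The case analysis in the proof of Theorem 3.4.51 translates directly into a small procedure. The following sketch is only an illustration under our own naming (the function `classify` and its return format are hypothetical): given the coefficients a, b, c, d with respect to an orthonormal basis {v, w}, it either returns the angle θ with a = cos θ and b = sin θ, or coordinates (relative to {v, w}) of unit eigenvectors u_1 and u_2 for the eigenvalues 1 and −1, using the solutions (−b, a − 1) and (−b, a + 1) of the two linear systems.

```python
import math

def classify(a, b, c, d, tol=1e-12):
    """Classify the orthogonal operator f(v) = a v + b w, f(w) = c v + d w,
    where {v, w} is an orthonormal basis of a 2-dimensional real space."""
    if (abs(a * a + b * b - 1) > tol or abs(c * c + d * d - 1) > tol
            or abs(a * c + b * d) > tol):
        raise ValueError("coefficients do not define an orthogonal operator")
    if a * d - b * c > 0:
        # ad - bc = 1: a = cos(theta), b = sin(theta).
        # math.atan2 returns an angle in [-pi, pi].
        return ("rotation", math.atan2(b, a))
    # ad - bc = -1: here c = b and d = -a.
    # Eigenvector for 1: (-b, a - 1), unless b = 0 and a = 1 (then v itself).
    u1 = (-b, a - 1) if abs(b) > tol or abs(a - 1) > tol else (1.0, 0.0)
    # Eigenvector for -1: (-b, a + 1), unless b = 0 and a = -1 (then v itself).
    u2 = (-b, a + 1) if abs(b) > tol or abs(a + 1) > tol else (1.0, 0.0)
    n1, n2 = math.hypot(*u1), math.hypot(*u2)
    return ("reflection",
            (u1[0] / n1, u1[1] / n1),
            (u2[0] / n2, u2[1] / n2))

def image(a, b, c, d, u):
    """Coordinates of f(x v + y w) for u = (x, y)."""
    x, y = u
    return (a * x + c * y, b * x + d * y)

rotation = classify(0.0, 1.0, -1.0, 0.0)      # f(v) = w, f(w) = -v
reflection = classify(0.6, 0.8, 0.8, -0.6)    # a = 0.6, b = 0.8, t = -1
```

For example, `classify(0.0, 1.0, -1.0, 0.0)` falls under the first alternative with θ = π/2, while the second call returns two orthogonal unit vectors that can be tested with `image`.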




Using Theorem 3.4.51 we can give a more detailed description of the f- 
invariant subspaces in Theorem 3.4.50. 


Theorem 3.4.52. Let V be a finite dimensional real inner product space
and let f : V → V be an orthogonal operator. There are f-invariant
subspaces U_1, ..., U_p, W_1, ..., W_q, X_1, ..., X_r ⊆ V such that

\[ V = U_1 \oplus \cdots \oplus U_p \oplus W_1 \oplus \cdots \oplus W_q \oplus X_1 \oplus \cdots \oplus X_r \]

and

(a) for every j ∈ {1, ..., p}, dim U_j = 1 and there is a nonzero vector
u_j ∈ U_j such that f(u_j) = u_j;

(b) for every k ∈ {1, ..., q}, dim W_k = 1 and there is a nonzero vector
w_k ∈ W_k such that f(w_k) = −w_k;

(c) for every l ∈ {1, ..., r}, dim X_l = 2 and there are orthonormal
vectors x_l, y_l ∈ X_l and a unique θ_l ∈ (−π, π] such that

\[ f(x_l) = \cos\theta_l\, x_l + \sin\theta_l\, y_l, \qquad f(y_l) = -\sin\theta_l\, x_l + \cos\theta_l\, y_l. \]

3.4.5 Positive operators 


There is some similarity between operators on a complex inner product space 
and the complex numbers. Self-adjoint operators are like real numbers and 
unitary operators are like complex numbers of modulus 1. Now we are going to 
consider operators that behave like nonnegative numbers. 


Definition 3.4.53. Let V be an inner product space. A linear operator
f : V → V is called positive if

\[ \langle f(x), x\rangle \ge 0 \]

for every x ∈ V.


If f : V → V is a positive operator, then ⟨f(x), x⟩ is a real number for every
x ∈ V and thus

\[ \langle f(x), x\rangle = \overline{\langle f(x), x\rangle} = \langle x, f(x)\rangle, \]

which shows that positive operators are self-adjoint.




Example 3.4.54. Consider the vector space C([a, b]) of complex-valued
continuous functions defined on an interval [a, b] with the inner product

\[ \langle f, g\rangle = \int_a^b f(t)\overline{g(t)}\, dt \]

and let φ : [a, b] → ℝ be a positive continuous function. Show that the
operator Φ : C([a, b]) → C([a, b]) defined by Φ(f) = φf is a positive
operator.


Solution. For all f ∈ C([a, b]) we have

\[ \langle \Phi(f), f\rangle = \int_a^b \varphi(t) f(t)\overline{f(t)}\, dt = \int_a^b \varphi(t) |f(t)|^2\, dt \ge 0. \]


Example 3.4.55. Let V = ℂ^∞ be the vector space of all infinite sequences of
complex numbers with only a finite number of nonzero terms with the inner
product defined as

\[ \langle (x_1, x_2, \ldots), (y_1, y_2, \ldots)\rangle = \sum_{j=1}^{\infty} x_j \overline{y_j}. \]

Consider the linear operator f : ℂ^∞ → ℂ^∞ defined as

\[ f(x_1, x_2, \ldots) = (\alpha_1 x_1, \alpha_2 x_2, \ldots), \]

where α_1, α_2, ... are arbitrary complex numbers. Show that f is a positive
operator if and only if all α_j's are positive real numbers.


Solution. If all α_j's are positive real numbers, then for every (x_1, x_2, ...) ∈ ℂ^∞
we have

\[
\begin{aligned}
\langle f(x_1, x_2, \ldots), (x_1, x_2, \ldots)\rangle
&= \langle (\alpha_1 x_1, \alpha_2 x_2, \ldots), (x_1, x_2, \ldots)\rangle \\
&= \sum_{j=1}^{\infty} \alpha_j x_j \overline{x_j} = \sum_{j=1}^{\infty} \alpha_j |x_j|^2 \ge 0,
\end{aligned}
\]

so f is a positive operator.

Now assume that ⟨f(x_1, x_2, ...), (x_1, x_2, ...)⟩ ≥ 0 for every (x_1, x_2, ...) ∈ ℂ^∞.

If, for every integer j ≥ 1, we denote by e_j ∈ ℂ^∞ the sequence (x_1, x_2, ...)
such that x_j = 1 and x_k = 0 for k ≠ j, then we have

\[ 0 \le \langle f(e_j), e_j\rangle = \alpha_j. \]




The operations of adjoint of an operator and conjugate of a complex number
have similar algebraic properties. For example, for any complex number z we
have z z̄ ≥ 0. In the next theorem we formulate a similar property for linear
operators.


Theorem 3.4.56. Let V be a finite dimensional inner product space
and let f : V → V be a linear operator. The operators f f* and f*f are
positive.


Proof. For every x ∈ V we have

\[ \langle f f^*(x), x\rangle = \langle f^*(x), f^*(x)\rangle = \|f^*(x)\|^2 \ge 0 \]

and

\[ \langle f^* f(x), x\rangle = \langle f(x), f(x)\rangle = \|f(x)\|^2 \ge 0. \]
□


The following theorem is similar to Theorems 3.4.16 and 3.4.30. These three 
theorems characterize normal, self-adjoint, and positive operators in terms of 
their eigenvalues. 


Theorem 3.4.57. Let V be a finite dimensional inner product space and
let f : V → V be a nonzero linear operator. The following conditions are
equivalent:

(a) f is a positive operator;

(b) There are orthonormal vectors e_1, ..., e_r and positive numbers
λ_1, ..., λ_r such that for every v ∈ V we have

\[ f(v) = \sum_{j=1}^{r} \lambda_j \langle v, e_j\rangle e_j; \]

(c) There are orthonormal vectors e_1, ..., e_r and positive numbers
λ_1, ..., λ_r such that

\[ f = \sum_{j=1}^{r} \lambda_j\, \mathrm{proj}_{e_j}. \]


Proof. If f is a positive operator, then it is self-adjoint and there are
orthonormal vectors e_1, ..., e_r and nonzero real numbers λ_1, ..., λ_r such that
f(v) = \sum_{j=1}^{r} λ_j ⟨v, e_j⟩ e_j for every vector v ∈ V. Since, for every j, we have

\[ 0 \le \langle f(e_j), e_j\rangle = \langle \lambda_j e_j, e_j\rangle = \lambda_j \langle e_j, e_j\rangle = \lambda_j \|e_j\|^2 = \lambda_j, \]

λ_1, ..., λ_r are positive numbers. This shows that (a) implies (b).
Conditions (b) and (c) are equivalent, because proj_{e_j}(v) = ⟨v, e_j⟩ e_j.
Now we assume that for every v ∈ V we have f(v) = \sum_{j=1}^{r} λ_j ⟨v, e_j⟩ e_j,
where {e_1, ..., e_r} is an orthonormal set and λ_1, ..., λ_r are positive numbers.
Then for every v ∈ V we have

\[
\begin{aligned}
\langle f(v), v\rangle
&= \Big\langle \sum_{j=1}^{r} \lambda_j \langle v, e_j\rangle e_j, v\Big\rangle
 = \sum_{j=1}^{r} \lambda_j \langle v, e_j\rangle \langle e_j, v\rangle \\
&= \sum_{j=1}^{r} \lambda_j \langle v, e_j\rangle \overline{\langle v, e_j\rangle}
 = \sum_{j=1}^{r} \lambda_j |\langle v, e_j\rangle|^2 \ge 0.
\end{aligned}
\]

This shows that (b) implies (a), which completes the proof. □


Example 3.4.58. Consider the operator f ∈ ℒ(ℂ³, ℂ³) defined as f(x) = Ax
where

\[ A = \begin{bmatrix} 14 & 2 & 4 \\ 2 & 17 & -2 \\ 4 & -2 & 14 \end{bmatrix}. \]

Show that f is a positive operator.


Solution. Since the matrix A is symmetric, the operator f is normal and thus
ℂ³ has an orthonormal basis consisting of eigenvectors of f. Hence, to show
that f is a positive operator, it suffices to show that all eigenvalues of f are
nonnegative numbers. If λ is an eigenvalue of f, then the following system has
a nontrivial solution:

\[
\begin{aligned}
(14-\lambda)x + 2y + 4z &= 0 \\
2x + (17-\lambda)y - 2z &= 0 \\
4x - 2y + (14-\lambda)z &= 0.
\end{aligned}
\]

If we add the first and the third equations, we get

\[ (18-\lambda)(x+z) = 0. \]

It is easy to see that for λ = 18 the system has nontrivial solutions. If λ ≠ 18,
the system is equivalent to the system

\[
\begin{aligned}
x + z &= 0 \\
2x + (17-\lambda)y - 2z &= 0 \\
4x - 2y + (14-\lambda)z &= 0.
\end{aligned}
\]

If we let z = −x, then we get

\[
\begin{aligned}
4x + (17-\lambda)y &= 0 \\
(\lambda-10)x - 2y &= 0.
\end{aligned}
\]

We multiply the first equation by (10 − λ)/4 and add to the second and get

\[ \left(\frac{(17-\lambda)(10-\lambda)}{4} - 2\right) y = 0 \]

or

\[ \frac{\lambda^2 - 27\lambda + 162}{4}\, y = 0. \]

The roots of the equation λ² − 27λ + 162 = 0 are 9 and 18.
If λ ≠ 9 and λ ≠ 18, then the only solution is x = y = z = 0. Consequently,
9 and 18 are the only eigenvalues of f. Since these are positive numbers, f is
a positive operator. □


The square root of a positive operator 


Every positive real number has a unique positive square root. A similar property 
holds for positive operators. 


Definition 3.4.59. Let V be an inner product space and let f : V → V
be a linear operator. An operator g : V → V is called a square root of f
if g² = f.


Example 3.4.60. Let V = ℂ^∞ be the vector space of all infinite sequences of
complex numbers with only a finite number of nonzero terms with the inner
product defined as ⟨(x_1, x_2, ...), (y_1, y_2, ...)⟩ = \sum_{j=1}^{\infty} x_j \overline{y_j}. If f : ℂ^∞ → ℂ^∞
is the operator defined as

\[ f(x_1, x_2, \ldots) = (\alpha_1 x_1, \alpha_2 x_2, \ldots), \]

where α_1, α_2, ... are positive numbers, then the operator g : ℂ^∞ → ℂ^∞ defined
as

\[ g(x_1, x_2, \ldots) = (\sqrt{\alpha_1}\, x_1, \sqrt{\alpha_2}\, x_2, \ldots) \]

is a square root of f.
Note that every operator of the form

\[ h(x_1, x_2, \ldots) = ((-1)^{n_1}\sqrt{\alpha_1}\, x_1, (-1)^{n_2}\sqrt{\alpha_2}\, x_2, \ldots), \]

where n_j ∈ {1, 2}, is a square root of f, but the only square root of f that is
a positive operator is the operator g defined above.




Theorem 3.4.61. Let V be a finite dimensional inner product space and
let f : V → V be a positive operator. There is a unique positive operator
g : V → V such that g² = f.


Proof. We offer two different proofs of existence and two different proofs of
uniqueness.

The first proof of existence: Without loss of generality we can assume that
f is a nonzero positive operator. By Theorem 3.4.57, for every x ∈ V we have

\[ f(x) = \sum_{j=1}^{r} \lambda_j \langle x, e_j\rangle e_j, \]

where {e_1, ..., e_r} is an orthonormal set and λ_1, ..., λ_r are positive numbers.
It is easy to verify that for the positive operator

\[ g(x) = \sum_{j=1}^{r} \sqrt{\lambda_j}\, \langle x, e_j\rangle e_j \]

we have g² = f.

The second proof of existence: Since f is a positive operator, all eigenvalues
of f are nonnegative. Let p be a polynomial such that p(λ) = √λ for every
eigenvalue λ of f. Since, by Example 3.4.18, (p(f))² = f and p(f) is a positive
operator, we can take g = p(f).

The first proof of uniqueness: Let p be a polynomial such that p(λ) = √λ
for every eigenvalue λ of f and let g = p(f). Now assume h is a positive
operator such that h² = f. If μ is any eigenvalue of h and v is an eigenvector
corresponding to μ, then μ ≥ 0 and f(v) = μ²v, which means that μ² is an
eigenvalue of f and thus p(μ²) = μ. Consequently,

\[ g(v) = p(f)(v) = p(h^2)(v) = p(\mu^2)v = \mu v = h(v). \]

Since g(v) = h(v) for every eigenvector v of h and there is a basis of V of
eigenvectors of h, we can conclude that g = h.

The second proof of uniqueness: Let p be a polynomial such that (p(f))² = f
and let h be a positive operator such that h² = f. Then

\[ hf = hh^2 = h^2 h = fh \]

and consequently

\[ hg = hp(f) = p(f)h = gh, \]

where g = p(f). Since h² − g² = 0, for every v ∈ V we have

\[
\begin{aligned}
0 &= \langle (h-g)v, (h^2-g^2)v\rangle = \langle (h-g)v, (h+g)(h-g)v\rangle \\
&= \langle (h-g)v, h(h-g)v\rangle + \langle (h-g)v, g(h-g)v\rangle.
\end{aligned}
\]

This gives us

\[ \langle (h-g)(v), h(h-g)(v)\rangle = 0 \quad\text{and}\quad \langle (h-g)(v), g(h-g)(v)\rangle = 0, \]

because both h and g are positive operators, and consequently

\[
\begin{aligned}
&\langle (h-g)(v), (h-g)(h-g)(v)\rangle \\
&\quad = \langle (h-g)(v), h(h-g)(v)\rangle - \langle (h-g)(v), g(h-g)(v)\rangle = 0.
\end{aligned}
\]

Hence ⟨(h − g)³(v), v⟩ = 0, which gives (h − g)³ = 0. Since h − g is a self-adjoint
operator, we conclude that h − g = 0, by Theorem 3.3.15. □
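The second proof of existence is constructive, and it can be tried out on the operator of Example 3.4.58, whose only eigenvalues are 9 and 18. A polynomial of degree one already satisfies p(9) = 3 and p(18) = 3√2. The sketch below is a numerical illustration only; the variable names are ours, and floating-point arithmetic replaces exact computation.

```python
import math

# Matrix of Example 3.4.58; its eigenvalues are 9 and 18.
A = [[14, 2, 4],
     [2, 17, -2],
     [4, -2, 14]]
n = 3

def matmul(X, Y):
    """Product of two n x n matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Interpolate p(t) = alpha + beta * t through (9, sqrt(9)) and (18, sqrt(18)).
beta = (math.sqrt(18) - math.sqrt(9)) / (18 - 9)
alpha = math.sqrt(9) - 9 * beta

# G = p(A) = alpha * I + beta * A.
G = [[alpha * (i == j) + beta * A[i][j] for j in range(n)] for i in range(n)]

G_squared = matmul(G, G)
max_error = max(abs(G_squared[i][j] - A[i][j])
                for i in range(n) for j in range(n))
print(max_error)   # should be of the order of rounding errors
```

The matrix G is symmetric and its eigenvalues are p(9) = 3 and p(18) = 3√2, so it represents a positive operator, in agreement with the theorem.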


The unique positive square root of a positive operator f will be denoted √f.


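Example 3.4.62. Consider the operator f ∈ ℒ(ℂ², ℂ²) defined as f(x) = Px,
where

\[ P = \begin{bmatrix} 33 & 24-24i \\ 24+24i & 57 \end{bmatrix}. \]

Show that f is a positive operator and find its positive square root.


Solution. First we find the spectral decomposition of f:

\[ f = 9\,\mathrm{proj}_{\left[\begin{smallmatrix} 1-i \\ -1 \end{smallmatrix}\right]} + 81\,\mathrm{proj}_{\left[\begin{smallmatrix} 1-i \\ 2 \end{smallmatrix}\right]}. \]

Thus f is a positive operator and

\[ \sqrt{f} = 3\,\mathrm{proj}_{\left[\begin{smallmatrix} 1-i \\ -1 \end{smallmatrix}\right]} + 9\,\mathrm{proj}_{\left[\begin{smallmatrix} 1-i \\ 2 \end{smallmatrix}\right]}. \]

Since

\[ \mathrm{proj}_{\left[\begin{smallmatrix} 1-i \\ -1 \end{smallmatrix}\right]}\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{3}\left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} 1-i \\ -1 \end{bmatrix}\right\rangle \begin{bmatrix} 1-i \\ -1 \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 2 & -1+i \\ -1-i & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} \]

and

\[ \mathrm{proj}_{\left[\begin{smallmatrix} 1-i \\ 2 \end{smallmatrix}\right]}\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{6}\left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} 1-i \\ 2 \end{bmatrix}\right\rangle \begin{bmatrix} 1-i \\ 2 \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 1 & 1-i \\ 1+i & 2 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}, \]

we have

\[ \sqrt{f}\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 5 & 2-2i \\ 2+2i & 7 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}. \]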
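The square root in Example 3.4.62 can be assembled and tested numerically. The sketch below assumes that the eigenvectors of P for the eigenvalues 9 and 81 are (1 − i, −1) and (1 − i, 2); the code itself confirms this reading by rebuilding P from the two projections. Helper names are ours.

```python
# Square root of a positive operator from its spectral decomposition
# (Example 3.4.62); helper names are illustrative.

def matmul(X, Y):
    """Product of matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose of a matrix."""
    return [[X[i][j].conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

def close(X, Y, tol=1e-9):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol
               for i in range(len(X)) for j in range(len(X[0])))

def projection(v):
    """Matrix of the orthogonal projection onto Span{v}; v is a column."""
    norm_squared = sum(abs(row[0]) ** 2 for row in v)
    outer = matmul(v, adjoint(v))
    return [[entry / norm_squared for entry in row] for row in outer]

def combine(c1, M1, c2, M2):
    """The linear combination c1 * M1 + c2 * M2 of two 2 x 2 matrices."""
    return [[c1 * M1[i][j] + c2 * M2[i][j] for j in range(2)] for i in range(2)]

P = [[33, 24 - 24j],
     [24 + 24j, 57]]
Q1 = projection([[1 - 1j], [-1]])   # eigenvalue 9
Q2 = projection([[1 - 1j], [2]])    # eigenvalue 81

rebuilt = combine(9, Q1, 81, Q2)    # spectral decomposition of f
root = combine(3, Q1, 9, Q2)        # candidate for the positive square root
root_squared = matmul(root, root)

print(close(rebuilt, P), close(root_squared, P))
```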


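Example 3.4.63. In Example 3.4.58 we show that the operator f ∈ ℒ(ℂ³, ℂ³)
defined as f(x) = Ax, where

\[ A = \begin{bmatrix} 14 & 2 & 4 \\ 2 & 17 & -2 \\ 4 & -2 & 14 \end{bmatrix}, \]

is a positive operator. Find the spectral decomposition of √f.


Solution. In Example 3.4.58 we found that λ = 18 and λ = 9 are the eigenvalues
of f. We need to find an orthogonal basis of ℂ³ consisting of eigenvectors
of f.

For λ = 18 the system

\[
\begin{aligned}
(14-\lambda)x + 2y + 4z &= 0 \\
2x + (17-\lambda)y - 2z &= 0 \\
4x - 2y + (14-\lambda)z &= 0
\end{aligned}
\]

is equivalent to the equation

\[ 2x - y - 2z = 0. \]

This means that

\[ E_{18} = \left\{ \begin{bmatrix} x \\ 2x-2z \\ z \end{bmatrix} : x, z \in \mathbb{C} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ -2 \\ 1 \end{bmatrix} \right\}. \]

Note that the vectors (1, 2, 0) and (0, −2, 1) are not orthogonal. The projection of
the vector (1, 2, 0) on Span{(0, −2, 1)} is −(4/5)(0, −2, 1) and thus the vector

\[ \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + \frac{4}{5}\begin{bmatrix} 0 \\ -2 \\ 1 \end{bmatrix} = \frac{1}{5}\begin{bmatrix} 5 \\ 2 \\ 4 \end{bmatrix} \]

is orthogonal to (0, −2, 1) and

\[ \left\{ \begin{bmatrix} 5 \\ 2 \\ 4 \end{bmatrix}, \begin{bmatrix} 0 \\ -2 \\ 1 \end{bmatrix} \right\} \]

is an orthogonal basis of E_{18}.

For λ = 9 the system

\[
\begin{aligned}
(14-\lambda)x + 2y + 4z &= 0 \\
2x + (17-\lambda)y - 2z &= 0 \\
4x - 2y + (14-\lambda)z &= 0
\end{aligned}
\]

is equivalent to the system

\[
\begin{aligned}
x + z &= 0 \\
4x + 8y &= 0 \\
-x - 2y &= 0,
\end{aligned}
\]

which means that

\[ E_{9} = \left\{ \begin{bmatrix} -2y \\ y \\ 2y \end{bmatrix} : y \in \mathbb{C} \right\} = \mathrm{Span}\left\{ \begin{bmatrix} -2 \\ 1 \\ 2 \end{bmatrix} \right\}. \]

As expected, the vector (−2, 1, 2) is orthogonal to the vectors from E_{18} and

\[ \left\{ \begin{bmatrix} 5 \\ 2 \\ 4 \end{bmatrix}, \begin{bmatrix} 0 \\ -2 \\ 1 \end{bmatrix}, \begin{bmatrix} -2 \\ 1 \\ 2 \end{bmatrix} \right\} \]

is an orthogonal basis of eigenvectors of f. Since

\[ f = 18\,\mathrm{proj}_{\left[\begin{smallmatrix} 5 \\ 2 \\ 4 \end{smallmatrix}\right]} + 18\,\mathrm{proj}_{\left[\begin{smallmatrix} 0 \\ -2 \\ 1 \end{smallmatrix}\right]} + 9\,\mathrm{proj}_{\left[\begin{smallmatrix} -2 \\ 1 \\ 2 \end{smallmatrix}\right]}, \]

we have

\[ \sqrt{f} = 3\sqrt2\,\mathrm{proj}_{\left[\begin{smallmatrix} 5 \\ 2 \\ 4 \end{smallmatrix}\right]} + 3\sqrt2\,\mathrm{proj}_{\left[\begin{smallmatrix} 0 \\ -2 \\ 1 \end{smallmatrix}\right]} + 3\,\mathrm{proj}_{\left[\begin{smallmatrix} -2 \\ 1 \\ 2 \end{smallmatrix}\right]}. \]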


3.5 Singular value decomposition 


Spectral decomposition is formulated for operators on an inner product space,
that is, operators f : V → V where V is an inner product space. Now we
are going to discuss a decomposition similar to spectral decomposition for
linear transformations between two different inner product spaces. We begin by
presenting an example which will motivate the main result of this section.


3.5. SINGULAR VALUE DECOMPOSITION 201 


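Example 3.5.1. Let V and W be finite dimensional inner product spaces and
let f : V → W be a linear transformation. Suppose for some orthonormal
vectors v_1, v_2 ∈ V we have

\[ f^* f(x) = 49\langle x, v_1\rangle v_1 + 25\langle x, v_2\rangle v_2 \]

for every x ∈ V. Let w_1 = (1/7) f(v_1) and w_2 = (1/5) f(v_2). Show that
‖w_1‖ = ‖w_2‖ = 1, ⟨w_1, w_2⟩ = 0, and

\[ f(x) = 7\langle x, v_1\rangle w_1 + 5\langle x, v_2\rangle w_2 \]

for every x ∈ V.


Solution. Since

\[ \|f(v_1)\|^2 = \langle f^* f(v_1), v_1\rangle = 49 \quad\text{and}\quad \|f(v_2)\|^2 = \langle f^* f(v_2), v_2\rangle = 25, \]

we have

\[ \|w_1\| = \Big\|\frac{1}{7} f(v_1)\Big\| = 1 \quad\text{and}\quad \|w_2\| = \Big\|\frac{1}{5} f(v_2)\Big\| = 1 \]

and

\[
\begin{aligned}
\langle w_1, w_2\rangle &= \frac{1}{35}\langle f(v_1), f(v_2)\rangle
 = \frac{1}{35}\langle f^* f(v_1), v_2\rangle = \frac{1}{35}\langle 49 v_1, v_2\rangle \\
&= \frac{49}{35}\langle v_1, v_2\rangle = 0.
\end{aligned}
\]

Now we extend the set {v_1, v_2} to {v_1, v_2, ..., v_n}, an orthonormal basis
of V. Then, for every x ∈ V, we have

\[ x = \langle x, v_1\rangle v_1 + \langle x, v_2\rangle v_2 + \cdots + \langle x, v_n\rangle v_n \]

and thus

\[ f(x) = \langle x, v_1\rangle f(v_1) + \langle x, v_2\rangle f(v_2) + \cdots + \langle x, v_n\rangle f(v_n). \]

Note that ‖f(v_j)‖² = ⟨f*f(v_j), v_j⟩ = 0 for j > 2. Hence

\[ f(x) = \langle x, v_1\rangle f(v_1) + \langle x, v_2\rangle f(v_2) = 7\langle x, v_1\rangle w_1 + 5\langle x, v_2\rangle w_2. \]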




Theorem 3.5.2. Let V and W be finite dimensional inner product
spaces. For every nonzero linear transformation f : V → W there are
positive numbers σ_1, ..., σ_r, orthonormal vectors v_1, ..., v_r ∈ V, and
orthonormal vectors w_1, ..., w_r ∈ W such that

\[ f(x) = \sum_{j=1}^{r} \sigma_j \langle x, v_j\rangle w_j \]

for every x ∈ V.


Proof. The operator f*f is self-adjoint. Since

\[ \langle f^* f(x), x\rangle = \langle f(x), f(x)\rangle = \|f(x)\|^2 \ge 0 \]

for every x ∈ V, f*f is a nonzero positive operator and thus there are
orthonormal vectors v_1, ..., v_r and positive numbers λ_1, ..., λ_r such that

\[ f^* f(x) = \sum_{j=1}^{r} \lambda_j \langle x, v_j\rangle v_j \]

for every x ∈ V. Now we extend the set {v_1, ..., v_r} to an orthonormal basis
{v_1, ..., v_n} of V. Then, for every x ∈ V, we have

\[ x = \sum_{j=1}^{n} \langle x, v_j\rangle v_j. \]

Since, for every j = 1, ..., r, we have

\[ \|f(v_j)\|^2 = \langle f^* f(v_j), v_j\rangle = \lambda_j \|v_j\|^2 = \lambda_j \]

and ‖f(v_j)‖² = ⟨f*f(v_j), v_j⟩ = 0 for j > r, we can write

\[ f(x) = \sum_{j=1}^{r} \langle x, v_j\rangle f(v_j) = \sum_{j=1}^{r} \|f(v_j)\|\, \langle x, v_j\rangle\, \frac{1}{\|f(v_j)\|} f(v_j). \]

If we let σ_j = ‖f(v_j)‖ = √λ_j and w_j = (1/‖f(v_j)‖) f(v_j), for j ≤ r, then we obtain
the desired decomposition of f:

\[ f(x) = \sum_{j=1}^{r} \sigma_j \langle x, v_j\rangle w_j \]

for every x ∈ V.

Moreover, for every j, k ∈ {1, ..., r} we have

\[
\begin{aligned}
\langle w_j, w_k\rangle
&= \Big\langle \frac{1}{\|f(v_j)\|} f(v_j), \frac{1}{\|f(v_k)\|} f(v_k)\Big\rangle
 = \frac{1}{\sigma_j \sigma_k}\langle f(v_j), f(v_k)\rangle \\
&= \frac{1}{\sigma_j \sigma_k}\langle f^* f(v_j), v_k\rangle
 = \frac{1}{\sigma_j \sigma_k}\langle \lambda_j v_j, v_k\rangle
 = \frac{\lambda_j}{\sigma_j \sigma_k}\langle v_j, v_k\rangle,
\end{aligned}
\]

so the vectors w_1, ..., w_r are orthonormal. □


Corollary 3.5.3. Let V and W be finite dimensional inner product
spaces. For every nonzero linear transformation f : V → W there are
orthonormal bases {v_1, ..., v_n} of V and {w_1, ..., w_m} of W and positive
numbers σ_1, ..., σ_r, for some r ≤ m and r ≤ n, such that

\[ f(v_j) = \sigma_j w_j \]

for j = 1, ..., r and f(v_j) = 0 for j = r + 1, ..., n, if n > r.


Proof. Let v_1, ..., v_n ∈ V, λ_1, ..., λ_r, σ_1, ..., σ_r > 0, and w_1, ..., w_r ∈ W be
as defined in the proof of Theorem 3.5.2. It suffices to extend {w_1, ..., w_r} to
an orthonormal basis of W. □


The representation of a linear transformation between inner product spaces
given in Theorem 3.5.2 is called the singular value decomposition of f. If
f : V → V is a positive operator, then its singular value decomposition is the same
as its spectral decomposition. If f : V → V is a normal operator and

\[ f(x) = \sum_{j=1}^{r} \lambda_j \langle x, v_j\rangle v_j \]

is its spectral decomposition with every λ_j ≠ 0, then

\[ f(x) = \sum_{j=1}^{r} |\lambda_j|\, \langle x, v_j\rangle\, \frac{\lambda_j}{|\lambda_j|} v_j \]

is its singular value decomposition. Note that, if λ_j is a nonzero real number,
then λ_j/|λ_j| is 1 or −1.


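Example 3.5.4. Let f : V → W, v_1, ..., v_r ∈ V, σ_1, ..., σ_r > 0, and
w_1, ..., w_r ∈ W be as defined in Theorem 3.5.2. Show that

\[ f^*(y) = \sum_{j=1}^{r} \sigma_j \langle y, w_j\rangle v_j \]

for every y ∈ W.


Solution. For every x ∈ V and y ∈ W we have

\[
\begin{aligned}
\langle f(x), y\rangle
&= \Big\langle \sum_{j=1}^{r} \sigma_j \langle x, v_j\rangle w_j, y\Big\rangle
 = \sum_{j=1}^{r} \langle \sigma_j \langle x, v_j\rangle w_j, y\rangle \\
&= \sum_{j=1}^{r} \sigma_j \langle x, v_j\rangle \langle w_j, y\rangle
 = \sum_{j=1}^{r} \sigma_j \big\langle x, \overline{\langle w_j, y\rangle}\, v_j\big\rangle \\
&= \sum_{j=1}^{r} \sigma_j \langle x, \langle y, w_j\rangle v_j\rangle
 = \Big\langle x, \sum_{j=1}^{r} \sigma_j \langle y, w_j\rangle v_j\Big\rangle = \langle x, f^*(y)\rangle.
\end{aligned}
\]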

The following theorem can be interpreted as a form of uniqueness of the 
singular value decomposition. 


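Theorem 3.5.5. Let V and W be finite dimensional inner product spaces
and let f : V → W be a nonzero linear transformation. If there are
positive numbers σ_1, ..., σ_r, orthonormal vectors v_1, ..., v_r ∈ V, and
orthonormal vectors w_1, ..., w_r ∈ W such that

\[ f(x) = \sum_{j=1}^{r} \sigma_j \langle x, v_j\rangle w_j \]

for every x ∈ V, then

\[ f^* f(x) = \sum_{j=1}^{r} \sigma_j^2 \langle x, v_j\rangle v_j \]

for every x ∈ V.


Proof. From the result in Example 3.5.4 we get

\[
\begin{aligned}
f^* f(x) &= \sum_{j=1}^{r} \sigma_j \langle f(x), w_j\rangle v_j \\
&= \sum_{j=1}^{r} \sigma_j \Big\langle \sum_{k=1}^{r} \sigma_k \langle x, v_k\rangle w_k, w_j\Big\rangle v_j \\
&= \sum_{j=1}^{r} \sigma_j^2 \langle x, v_j\rangle v_j
\end{aligned}
\]

for every x ∈ V. □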




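Example 3.5.6. Let f : V → W, v_1, ..., v_r ∈ V, σ_1, ..., σ_r > 0, and
w_1, ..., w_r ∈ W be as defined in Theorem 3.5.2. Show that the linear
transformation f⁺ : W → V defined as

\[ f^{+}(y) = \sum_{j=1}^{r} \frac{1}{\sigma_j} \langle y, w_j\rangle v_j \]

is the unique linear transformation g from W to V such that the following two
conditions are satisfied:

(a) gf = proj_{(ker f)^⊥};

(b) g(y) = 0 for every vector y ∈ (ran f)^⊥.


Solution. Since, for every k ∈ {1, ..., r},

\[
\begin{aligned}
f^{+} f(v_k) &= f^{+}\Big(\sum_{j=1}^{r} \sigma_j \langle v_k, v_j\rangle w_j\Big) = f^{+}(\sigma_k \langle v_k, v_k\rangle w_k) = \sigma_k f^{+}(w_k) \\
&= \sigma_k \sum_{j=1}^{r} \frac{1}{\sigma_j} \langle w_k, w_j\rangle v_j = \sigma_k \frac{1}{\sigma_k} \langle w_k, w_k\rangle v_k = v_k = \mathrm{proj}_{(\ker f)^{\perp}}(v_k),
\end{aligned}
\]

we have f⁺f = proj_{(ker f)^⊥}, because (ker f)^⊥ = Span{v_1, ..., v_r}.
If y ∈ (ran f)^⊥, then ⟨y, w_j⟩ = 0 for every j ∈ {1, ..., r} and thus

\[ f^{+}(y) = \sum_{j=1}^{r} \frac{1}{\sigma_j} \langle y, w_j\rangle v_j = 0. \]

Now assume that a linear transformation g : W → V satisfies (a) and (b).
Note that ran f = Span{w_1, ..., w_r} and W = ran f ⊕ (ran f)^⊥. If y ∈ W,
then y = y_1 + y_2, where y_1 ∈ ran f and y_2 ∈ (ran f)^⊥. Then y_1 = f(x) for
some x ∈ V, and consequently

\[
\begin{aligned}
g(y) &= g(y_1 + y_2) = g(y_1) + g(y_2) = gf(x) + 0 \\
&= \mathrm{proj}_{(\ker f)^{\perp}}(x) = \sum_{j=1}^{r} \langle x, v_j\rangle v_j = \sum_{j=1}^{r} \Big\langle x, \frac{1}{\sigma_j^2} f^* f(v_j)\Big\rangle v_j \\
&= \sum_{j=1}^{r} \frac{1}{\sigma_j} \Big\langle f(x), \frac{1}{\sigma_j} f(v_j)\Big\rangle v_j = \sum_{j=1}^{r} \frac{1}{\sigma_j} \langle y_1, w_j\rangle v_j \\
&= \sum_{j=1}^{r} \frac{1}{\sigma_j} \langle y, w_j\rangle v_j.
\end{aligned}
\]

Therefore, if g : W → V is a linear transformation that satisfies (a) and (b),
then g(y) = \sum_{j=1}^{r} (1/σ_j) ⟨y, w_j⟩ v_j. □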
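Conditions (a) and (b) can be tested on a concrete operator of rank one. The sketch below uses the matrix A = [[5i, 3+4i], [−3+4i, 5i]] (our reading of the normal operator of Example 3.4.44), for which A*A has eigenvalues 100 and 0, and takes v = (4 − 3i, 5)/(5√2) as a unit eigenvector of A*A for 100; both choices are assumptions of this illustration, and the helper names are ours.

```python
import math

def matmul(X, Y):
    """Product of matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose of a matrix."""
    return [[X[i][j].conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

def close(X, Y, tol=1e-9):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol
               for i in range(len(X)) for j in range(len(X[0])))

A = [[5j, 3 + 4j],
     [-3 + 4j, 5j]]

s = 5 * math.sqrt(2)
v = [[(4 - 3j) / s], [5 / s]]          # unit vector spanning (ker f)^perp

fv = matmul(A, v)
sigma = math.sqrt(sum(abs(row[0]) ** 2 for row in fv))   # singular value
w = [[row[0] / sigma] for row in fv]                     # unit vector spanning ran f

# Matrix of f^+ : y -> (1/sigma) <y, w> v, that is (1/sigma) v w^*.
F_plus = [[entry / sigma for entry in row] for row in matmul(v, adjoint(w))]

# (a) f^+ f is the orthogonal projection on (ker f)^perp = Span{v}.
condition_a = close(matmul(F_plus, A), matmul(v, adjoint(v)))

# (b) f^+ vanishes on (ran f)^perp; y below is orthogonal to w.
y = [[3 + 4j], [-5j]]
condition_b = close(matmul(F_plus, y), [[0], [0]])

print(sigma, condition_a, condition_b)
```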


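Example 3.5.7. Consider the operator f ∈ ℒ(ℂ², ℂ²) defined as f(x) = Ax,
where

\[ A = \begin{bmatrix} 5i & 3+4i \\ -3+4i & 5i \end{bmatrix}. \]

Find the singular value decomposition of f.


Solution. We have

\[ A^* = \begin{bmatrix} -5i & -3-4i \\ 3-4i & -5i \end{bmatrix} \]

and

\[ A^* A = \begin{bmatrix} 50 & 40-30i \\ 40+30i & 50 \end{bmatrix}. \]

The eigenvalues of the matrix A*A are the roots of the equation

\[ (50-\lambda)^2 - (40-30i)(40+30i) = \lambda^2 - 100\lambda = 0, \]

that is, λ = 100 and λ = 0, and the orthonormal vectors

\[ \frac{1}{5\sqrt2}\begin{bmatrix} 4-3i \\ 5 \end{bmatrix} \quad\text{and}\quad \frac{1}{5\sqrt2}\begin{bmatrix} 4-3i \\ -5 \end{bmatrix} \]

are corresponding eigenvectors. Since

\[ \begin{bmatrix} 5i & 3+4i \\ -3+4i & 5i \end{bmatrix}\begin{bmatrix} 4-3i \\ 5 \end{bmatrix} = \begin{bmatrix} 30+40i \\ 50i \end{bmatrix} \]

and

\[ \left\| \begin{bmatrix} 30+40i \\ 50i \end{bmatrix} \right\| = 50\sqrt2, \]

the singular value decomposition of f is

\[ f\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = 10 \left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \frac{1}{5\sqrt2}\begin{bmatrix} 4-3i \\ 5 \end{bmatrix}\right\rangle \frac{1}{50\sqrt2}\begin{bmatrix} 30+40i \\ 50i \end{bmatrix}. \]

For practical calculations we can use a simplified form:

\[ f\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{5}\left\langle \begin{bmatrix} x \\ y \end{bmatrix}, \begin{bmatrix} 4-3i \\ 5 \end{bmatrix}\right\rangle \begin{bmatrix} 3+4i \\ 5i \end{bmatrix}. \]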




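Example 3.5.8. Consider the operator f ∈ ℒ(ℂ², ℂ²) defined as f(x) = Ax,
where

\[ A = \begin{bmatrix} 3+2i & 2-4i \\ 4+2i & 6+i \end{bmatrix}. \]

Find the singular value decomposition of f.

Solution. First we calculate

\[ A^* A = \begin{bmatrix} 33 & 24-24i \\ 24+24i & 57 \end{bmatrix}. \]

The eigenvalues of the matrix A*A are 9 and 81, and the vectors (1 − i, −1)
and (1 − i, 2) are corresponding eigenvectors. Consequently the singular value
decomposition of f is

\[ f(x) = 3\langle x, v_1\rangle w_1 + 9\langle x, v_2\rangle w_2, \]

where

\[ v_1 = \frac{1}{\sqrt3}\begin{bmatrix} 1-i \\ -1 \end{bmatrix}, \quad v_2 = \frac{1}{\sqrt6}\begin{bmatrix} 1-i \\ 2 \end{bmatrix}, \quad w_1 = \frac{1}{3} f(v_1) = \frac{1}{\sqrt3}\begin{bmatrix} 1+i \\ -i \end{bmatrix}, \quad w_2 = \frac{1}{9} f(v_2) = \frac{1}{\sqrt6}\begin{bmatrix} 1-i \\ 2 \end{bmatrix}. \]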
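The construction in the proof of Theorem 3.5.2 can be followed step by step for the operator of Example 3.5.8. The sketch below assumes that the matrix is A = [[3+2i, 2−4i], [4+2i, 6+i]] (the reading consistent with the matrix A*A shown in the solution) and that (1 − i, −1) and (1 − i, 2) are eigenvectors of A*A for 9 and 81. It normalizes these eigenvectors, sets σ_j = ‖f(v_j)‖ and w_j = f(v_j)/σ_j, and then reassembles f. Helper names are ours.

```python
import math

def matmul(X, Y):
    """Product of matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(X):
    """Conjugate transpose of a matrix."""
    return [[X[i][j].conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

def close(X, Y, tol=1e-9):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(X[i][j] - Y[i][j]) < tol
               for i in range(len(X)) for j in range(len(X[0])))

def norm(column):
    """Euclidean norm of a column vector."""
    return math.sqrt(sum(abs(row[0]) ** 2 for row in column))

A = [[3 + 2j, 2 - 4j],
     [4 + 2j, 6 + 1j]]

# Eigenvectors of A^* A for the eigenvalues 9 and 81 (assumed reading).
eigenvectors = [[[1 - 1j], [-1]], [[1 - 1j], [2]]]

triples = []
for e in eigenvectors:
    v = [[row[0] / norm(e)] for row in e]        # unit eigenvector v_j
    fv = matmul(A, v)
    sigma = norm(fv)                             # sigma_j = ||f(v_j)||
    w = [[row[0] / sigma] for row in fv]         # w_j = f(v_j) / sigma_j
    triples.append((sigma, v, w))

# f(x) = sum_j sigma_j <x, v_j> w_j has the matrix sum_j sigma_j w_j v_j^*.
rebuilt = [[0, 0], [0, 0]]
for sigma, v, w in triples:
    term = matmul(w, adjoint(v))
    rebuilt = [[rebuilt[i][j] + sigma * term[i][j] for j in range(2)]
               for i in range(2)]

singular_values = [sigma for sigma, _, _ in triples]
print(singular_values, close(rebuilt, A))
```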


Example 3.5.9. Consider the operator f ∈ ℒ(ℂ², ℂ⁴) defined as f(x) = Ax,
where 


ail 
i 
BS | ee 
eee? 


Find the singular value decomposition of f. 


Solution. First we find that 


jue? 
Sf) eit 2) iia a 
| ae pere 
-1 -2 


The eigenvalues of the matrix i | are 10 and 20, and 3] and | CURE 




corresponding eigenvectors. We have 


3) ll 5 3 #1 6) 
i 4) (all) || d 12 2! _ |0 
33) 2| = eles ee S| eas 
—l1 -—2 —5 —1 -2 0 
We note that 
5 5 
5 0 
| )=0 
—5 0 
The singular value decomposition of f is

\[ f(x) = \sqrt{20}\,\langle x, v_1\rangle w_1 + \sqrt{10}\,\langle x, v_2\rangle w_2, \]

where
i il 
5 uals 
elZ i aes a 
Vi a yale es 2 ns NY 22 oo | 
V5 2 V5 v2 
=a 0 
2 


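Example 3.5.10. Let P_2([−1, 1]) and P_1([−1, 1]) denote real valued polynomials
on the interval [−1, 1] of degree at most 2 and 1, respectively, with the
inner product defined as

\[ \langle p, q\rangle = \int_{-1}^{1} p(t) q(t)\, dt. \]

Find the singular value decomposition of the differential operator D = d/dt.


Solution. First we note that

\[ \left\{ \frac{1}{\sqrt2},\ \sqrt{\tfrac32}\, t,\ \sqrt{\tfrac52}\left(\tfrac32 t^2 - \tfrac12\right) \right\} \]

is an orthonormal basis in P_2([−1, 1]) and {1/√2, √(3/2) t} is an orthonormal
basis in P_1([−1, 1]). The matrix of the differential operator with respect to
these bases is

\[ \begin{bmatrix} 0 & \sqrt3 & 0 \\ 0 & 0 & \sqrt{15} \end{bmatrix} \]

and we have

\[ \begin{bmatrix} 0 & 0 \\ \sqrt3 & 0 \\ 0 & \sqrt{15} \end{bmatrix}\begin{bmatrix} 0 & \sqrt3 & 0 \\ 0 & 0 & \sqrt{15} \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 15 \end{bmatrix}, \]

so the nonzero eigenvalues of the operator D*D are 3 and 15. The polynomials
√(3/2) t and √(5/2)((3/2)t² − 1/2) are orthonormal eigenvectors of D*D corresponding
to the eigenvalues 3 and 15. Since

\[ \frac{D\big(\sqrt{\tfrac32}\, t\big)}{\big\| D\big(\sqrt{\tfrac32}\, t\big)\big\|} = \frac{1}{\sqrt2} \]

and

\[ \frac{D\big(\sqrt{\tfrac52}\big(\tfrac32 t^2 - \tfrac12\big)\big)}{\big\| D\big(\sqrt{\tfrac52}\big(\tfrac32 t^2 - \tfrac12\big)\big)\big\|} = \sqrt{\tfrac32}\, t, \]

the singular value decomposition of the differential operator D : P_2([−1, 1]) →
P_1([−1, 1]) is

\[
\begin{aligned}
D(p(t)) &= \sqrt3\, \Big\langle p(t), \sqrt{\tfrac32}\, t\Big\rangle \frac{1}{\sqrt2} + \sqrt{15}\, \Big\langle p(t), \sqrt{\tfrac52}\left(\tfrac32 t^2 - \tfrac12\right)\Big\rangle \sqrt{\tfrac32}\, t \\
&= \frac32 \int_{-1}^{1} t\, p(t)\, dt + \frac{15}{4} \int_{-1}^{1} (3t^2 - 1) p(t)\, dt\ t.
\end{aligned}
\]

While this result has limited practical applications, it is interesting that
on the space P_2([−1, 1]) differentiation can be expressed in terms of definite
integrals. □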
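The integral formula for the derivative is easy to test with exact rational arithmetic. The sketch below reads the final formula of Example 3.5.10 as D(p)(t) = (3/2)∫ t p(t) dt + (15/4)(∫ (3t² − 1) p(t) dt) t, with both integrals taken over [−1, 1]; this reading is an assumption recovered from the orthonormal bases in the solution. Polynomials are stored as coefficient lists [c_0, c_1, c_2], and the function names are ours.

```python
from fractions import Fraction

def integrate(coeffs):
    """Integral over [-1, 1] of the polynomial sum(coeffs[k] * t**k).
    Odd powers integrate to 0; t**k contributes 2 / (k + 1) for even k."""
    return sum(Fraction(2, k + 1) * c
               for k, c in enumerate(coeffs) if k % 2 == 0)

def multiply(p, q):
    """Coefficient list of the product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative_via_integrals(p):
    """Return [c0, c1] with D(p)(t) = c0 + c1 * t, computed only from
    definite integrals of p against t and 3 t^2 - 1."""
    c0 = Fraction(3, 2) * integrate(multiply([0, 1], p))
    c1 = Fraction(15, 4) * integrate(multiply([-1, 0, 3], p))
    return [c0, c1]

p = [Fraction(7), Fraction(-2), Fraction(5)]     # p(t) = 7 - 2 t + 5 t^2
result = derivative_via_integrals(p)             # compare with p'(t) = -2 + 10 t
print(result)
```

Because the formula is linear in p, agreement on the three monomials 1, t, and t² is enough to confirm it on all of P_2([−1, 1]).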


The polar decomposition 


We close this chapter with an introduction of another decomposition of operators 
on a finite dimensional inner product space, called the polar decomposition. To 
define the polar decomposition of an operator we need the notion of a partial 
isometry. 




Definition 3.5.11. Let V be an inner product space and let U be a
subspace of V. A linear transformation f : V → V is called a partial
isometry with initial space U if the following two conditions are satisfied:

(a) ‖f(x)‖ = ‖x‖ for every x ∈ U;

(b) f(x) = 0 for every x ∈ U^⊥.


Example 3.5.12. Let V be an inner product space and let g : V → V be the
linear transformation defined as

\[ g(x) = \sum_{j=1}^{r} \langle x, a_j\rangle b_j, \]

where {a_1, ..., a_r} and {b_1, ..., b_r} are orthonormal sets in V. Show that g
is a partial isometry with initial space Span{a_1, ..., a_r}.


Solution. If x ∈ Span{a_1, ..., a_r}, then there are numbers x_1, ..., x_r ∈ 𝕂 such
that

\[ x = x_1 a_1 + \cdots + x_r a_r. \]

Then

\[ \|x\|^2 = \|x_1 a_1 + \cdots + x_r a_r\|^2 = |x_1|^2 + \cdots + |x_r|^2 \]

and, because ⟨x, a_j⟩ = x_j for all j ∈ {1, ..., r}, we also have

\[ \|g(x)\|^2 = \|x_1 b_1 + \cdots + x_r b_r\|^2 = |x_1|^2 + \cdots + |x_r|^2. \]

It is clear that g(x) = 0 for every x ∈ Span{a_1, ..., a_r}^⊥. □


For any z ∈ ℂ we have z z̄ = |z|² and thus |z| = √(z z̄). We use this analogy to
define the operator |f| for an arbitrary linear operator on a finite dimensional
inner product space V. If f : V → V is a linear operator, then f*f is a positive
linear operator on V and thus it has a unique positive square root √(f*f). We
will use the notation

\[ |f| = \sqrt{f^* f}. \]

In other words, for any linear operator f : V → V on a finite dimensional inner
product space V there is a unique positive linear operator |f| : V → V such that

\[ |f|^2 = f^* f. \]




Theorem 3.5.13. Let $V$ be a finite dimensional inner product space and
let $f : V \to V$ be a linear transformation. There is a partial isometry
$g : V \to V$ with initial space $\operatorname{ran}|f|$ such that

\[ f = g|f|. \]

This representation is unique in the following sense: If $f = hp$ where
$p : V \to V$ is a positive operator and $h$ is a partial isometry on $V$ with
initial space $\operatorname{ran} p$, then $p = |f|$ and $g = h$.


Proof. If $u = |f|(v)$ for some $v \in V$, then we would like to define $g(u) = f(v)$,
but this does not define $g$ unless we can show that $|f|(v_1) = |f|(v_2)$ implies
$f(v_1) = f(v_2)$. Since

\[ \|f(v)\|^2 = \langle f(v), f(v) \rangle = \langle f^*f(v), v \rangle
= \langle |f|^2(v), v \rangle = \langle |f|(v), |f|(v) \rangle = \||f|(v)\|^2, \]

we have $\|f(v)\| = \||f|(v)\|$ and thus

\[ \|f(v_1) - f(v_2)\| = \|f(v_1 - v_2)\| = \||f|(v_1 - v_2)\| = \||f|(v_1) - |f|(v_2)\|. \]

Consequently $|f|(v_1) = |f|(v_2)$ implies $f(v_1) = f(v_2)$ and thus $g$ is well-defined on $\operatorname{ran}|f|$.
Moreover, $\|g(|f|(v))\| = \|f(v)\| = \||f|(v)\|$, so $g$ is an isometry on $\operatorname{ran}|f|$.
We extend $g$ to $V$ by setting $g(x) = 0$ for every $x \in (\operatorname{ran}|f|)^\perp$. Therefore, $g$ is a partial isometry
with initial space $\operatorname{ran}|f|$.

Now assume that $f = hp$ where $p : V \to V$ is a positive operator and $h : V \to V$ is a
partial isometry with initial space $\operatorname{ran} p$. Then for every $v \in V$ we
have

\[ \langle f^*f(v), v \rangle = \langle f(v), f(v) \rangle = \langle h(p(v)), h(p(v)) \rangle = \langle p(v), p(v) \rangle = \langle p^2(v), v \rangle, \]

which gives us $f^*f = p^2$, because $f^*f$ and $p^2$ are self-adjoint, and thus $p = |f|$.
Since $g$ and $h$ agree on $\operatorname{ran}|f|$ and both vanish on $(\operatorname{ran}|f|)^\perp$, we have $g = h$. □


The representation of a linear transformation $f : V \to V$ in the form presented
in Theorem 3.5.13 is called the polar decomposition of $f$. It is somewhat
similar to the polar form of a complex number: $z = |z|(\cos\theta + i\sin\theta)$.
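
To make the decomposition concrete, the following minimal sketch computes it for a real invertible 2 × 2 matrix using only the standard library. It is an illustration under stated assumptions, not the method of the text: the helper names `matmul`, `transpose` and `polar_2x2` are hypothetical, the matrix `A` is an arbitrary choice, and the square root of $A^T A$ is obtained from a closed formula that is valid only for 2 × 2 positive definite matrices (it follows from the Cayley-Hamilton theorem). Because `A` is invertible, $\operatorname{ran}|f|$ is the whole space and the partial isometry is an isometry.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    """Transpose of a 2x2 matrix (the adjoint in the real case)."""
    return [[X[j][i] for j in range(2)] for i in range(2)]

def polar_2x2(A):
    """Return (G, P) with A = G P, P = |A| positive, G an isometry.

    Assumption: A is a real invertible 2x2 matrix.
    """
    M = matmul(transpose(A), A)                      # M = A^T A is positive definite
    s = abs(A[0][0] * A[1][1] - A[0][1] * A[1][0])   # sqrt(det M) = |det A|
    t = math.sqrt(M[0][0] + M[1][1] + 2 * s)
    # Cayley-Hamilton: M^2 = tr(M) M - det(M) I, hence ((M + s I) / t)^2 = M.
    P = [[(M[0][0] + s) / t, M[0][1] / t],
         [M[1][0] / t, (M[1][1] + s) / t]]
    d = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    P_inv = [[P[1][1] / d, -P[0][1] / d],
             [-P[1][0] / d, P[0][0] / d]]
    G = matmul(A, P_inv)                             # G = A |A|^{-1}
    return G, P

A = [[1.0, 2.0], [3.0, 4.0]]
G, P = polar_2x2(A)
```

Here `P` plays the role of $|f|$ and `G` the role of $g$; multiplying `G` by `P` recovers `A` up to rounding, and `G` times its transpose is the identity up to rounding.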


Example 3.5.14. Consider the operator $f \in \mathcal{L}(\mathbb{C}^2, \mathbb{C}^2)$ defined as $f(x) = Ax$,


where ; 
Fete jes ae . 


4+2i1 6+72 


Determine the polar decomposition of f. 




Solution. According to Example 3.4.62 we can write 
—1 
1-i 3+ 3% 1-% 9— 9 
(Pail) = Pad (02) = Pe 
(Poa =P Ss]. vial) =P ag 


Now we can define an isometry g : V > V such that 
3—31]/\ — [343% d 9 92 Ne (992 
a Bo weal eee cee 18) es 18 


g(x) = (x, v1) wi + (x, V2) Wo, 


ets ce se a 
2 


We have 


that is, 
where x € C?, 


walobe- ale al ald 


Then $f = g|f|$, which is the polar decomposition of $f$. □


Example 3.5.15. Let $V$ be an inner product space and let $f : V \to V$ be a
linear transformation such that for every $x \in V$ we have

\[ f^*f(x) = \sum_{j=1}^{r} \lambda_j \langle x, a_j \rangle\, a_j, \]

where $\{a_1,\dots,a_r\}$ is an orthonormal set in $V$ and $\lambda_1,\dots,\lambda_r$ are positive numbers.
Find the polar decomposition of $f$.


Solution. The linear operator $|f|$ is defined by
\[ |f|(x) = \sum_{j=1}^{r} \sqrt{\lambda_j}\, \langle x, a_j \rangle\, a_j \]

for every $x \in V$. In the proof of Theorem 3.5.13 we show that

\[ \|f(a_j)\| = \||f|(a_j)\| = \sqrt{\lambda_j} > 0 \]

for $j \in \{1,\dots,r\}$, so we can define the operator $g : V \to V$ by

\[ g(y) = \sum_{j=1}^{r} \langle y, a_j \rangle\, b_j, \]

where $b_j = \frac{1}{\|f(a_j)\|} f(a_j) = \frac{1}{\sqrt{\lambda_j}} f(a_j)$. The operator $g$ is a partial isometry
and it is easy to verify that $f = g|f|$, which is the polar decomposition of $f$. □


3.6 Exercises 


3.6.1 Definitions and examples 


Exercise 3.1. Let $s : V \times V \to \mathbb{C}$ be a positive sesquilinear form. Show that
$|s(x,y)|^2 \le s(x,x)\,s(y,y)$ for every $x, y \in V$.

Exercise 3.2. Let $V$ be an inner product space and let $f : V \to V$ be a positive
operator (see Definition 3.4.53). Show that $|\langle f(x), y \rangle|^2 \le \langle f(x), x \rangle \langle f(y), y \rangle$ for
every $x, y \in V$.

Exercise 3.3. Let $V$ be an inner product space and let $f : V \to V$ be a positive
operator (see Definition 3.4.53). Show that $\|f(x)\|^3 \le \langle f(x), x \rangle\, \|f^2(x)\|$ for
every $x \in V$.

Exercise 3.4. Let $C^2([0,1])$ be the space of functions with continuous second
derivatives. Show that $\langle f, g \rangle = f(0)g(0) + f'(0)g'(0) + \int_0^1 f''(t)\,g''(t)\,dt$ is an
inner product in $C^2([0,1])$.


Exercise 3.5. Let v be a vector in an inner product space VY. Show that the 
function sy : V x V > C defined by sy(a, b) = (v, b)a is sesquilinear. 


Exercise 3.6. Let $V$ be a finite dimensional inner product space. Show that
for every linear operator $f : V \to V$ there is a nonnegative constant $\alpha$ such that
$\|f(v)\| \le \alpha \|v\|$ for every $v \in V$.

Exercise 3.7. Let $V$ be an inner product space and let $s : V \times V \to \mathbb{C}$ be a
sesquilinear form. Show that, if $s(x,y) = s(y,x)$ for every $x, y \in V$, then $s = 0$.

Exercise 3.8. Let $V$ and $W$ be inner product spaces and let $v \in V$ and $w \in W$.
Show that the function $w \otimes v : V \to W$ defined by $(w \otimes v)(x) = \langle x, v \rangle w$ is a linear
transformation and we have $f \circ (w \otimes v) = f(w) \otimes v$ for every linear operator
$f : W \to W$.

Exercise 3.9. Let $v \in \mathbb{C}^n$ and $w \in \mathbb{C}^m$. Find the matrix of the linear
transformation $w \otimes v$ as defined in Exercise 3.8.

Exercise 3.10. Let $U$, $V$, and $W$ be inner product spaces and let $u \in U$,
$v_1, v_2 \in V$, and $w \in W$. Show, with the notation from Exercise 3.8, that we
have $(w \otimes v_2)(v_1 \otimes u) = \langle v_1, v_2 \rangle (w \otimes u)$.

Exercise 3.11. Show that $\frac{1}{n}|a_1 + \dots + a_n| \le \sqrt{\frac{1}{n}(a_1^2 + \dots + a_n^2)}$ for any
$a_1,\dots,a_n \in \mathbb{R}$.




3.6.2 Orthogonal projections 


Exercise 3.12. Let $\{v_1,\dots,v_n\}$ be an orthonormal basis of the inner product
space $V$. Show, with the notation from Exercise 3.8, that $\operatorname{Id}_V = \sum_{j=1}^{n} v_j \otimes v_j$.

Exercise 3.13. Let $V$ and $W$ be inner product spaces and let $\{v_1,\dots,v_m\}$ and
$\{w_1,\dots,w_n\}$ be orthonormal bases in $V$ and $W$, respectively. Show that the
set $\{w_k \otimes v_j : j \in \{1,\dots,m\},\ k \in \{1,\dots,n\}\}$ is a basis of $\mathcal{L}(V, W)$ ($w_k \otimes v_j$ is
defined in Exercise 3.8).


Exercise 3.14. Determine the projection matrix on Span { Ei \ in C? and 


1 ; 
use it to determine the projection of the vector i on Span { i \ 
Exercise 3.15. Let $V$ be an inner product space and let $f : V \to V$ be an
orthogonal projection. Show that $\|f(v)\| = \|v\|$ implies $f(v) = v$ for every
$v \in V$.

Exercise 3.16. Let $V$ be an inner product space and let $f : V \to V$ be a linear
operator. Show that $f$ is an orthogonal projection if and only if $\ker(\operatorname{Id} - f) =
\operatorname{ran} f = (\ker f)^\perp$.

Exercise 3.17. Let $U$ and $V$ be subspaces of a finite-dimensional inner product
space $W$. Show that $\operatorname{proj}_U \operatorname{proj}_V = 0$ if and only if $\langle u, v \rangle = 0$ for every $u \in U$
and $v \in V$.

Exercise 3.18. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. Show, with the notation from Exercise 3.8, that
the operator $f$ is a nonzero projection if and only if there is an orthonormal set
$\{v_1,\dots,v_k\} \subset V$ such that $f = \sum_{j=1}^{k} v_j \otimes v_j$. Determine $\operatorname{ran} f$.


Exercise 3.19. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. Show that there is an orthonormal basis
$\{v_1,\dots,v_n\}$ of $V$ and numbers $x_{jk}$ for $1 \le j \le k \le n$ such that

\begin{align*}
f(v_1) &= x_{11}v_1, \\
f(v_2) &= x_{12}v_1 + x_{22}v_2, \\
f(v_3) &= x_{13}v_1 + x_{23}v_2 + x_{33}v_3, \\
&\ \ \vdots \\
f(v_n) &= x_{1n}v_1 + \dots + x_{n-1,n}v_{n-1} + x_{nn}v_n.
\end{align*}
Exercise 3.20. Consider the vector space of aan it F defined on 
the interval [—1, 1] with the inner product (f, g) =f g(t) dt. Find the best 
approximation of the function t? by functions from are 
Exercise 3.21. Consider the vector space of eye Pia defined on the 
interval [—7,7] with the inner product (f,g) = + J”, f(t) g(t) dt. Determine 
the projection of the function t on Span{e”} fc for: all ee n>1. 




Exercise 3.22. Let $V$ be an inner product space and let $p_1,\dots,p_n$ be orthogonal
projections such that $\operatorname{Id} = p_1 + \dots + p_n$. Show that $p_j p_k = 0$ whenever $j \ne k$.


3.6.3. The adjoint of a linear transformation 


Exercise 3.23. Let $V$ be a finite-dimensional inner product space and let
$f, g : V \to V$ be orthogonal projections. Show that the following conditions
are equivalent.

(a) $\operatorname{ran} f \subseteq \operatorname{ran} g$
(b) $gf = f$
(c) $fg = f$

Exercise 3.24. Let $V$ be an inner product space and let $f, g : V \to V$ be
orthogonal projections. If $\operatorname{ran} f \subseteq \operatorname{ran} g$, show that $g - f$ is an orthogonal
projection and $\operatorname{ran}(g - f) = (\operatorname{ran} f)^\perp \cap \operatorname{ran} g$.

Exercise 3.25. Let $V$ be an inner product space and let $f : V \to V$ be a linear
operator. If $f$ is invertible and self-adjoint, show that $f^{-1}$ is self-adjoint.

Exercise 3.26. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a self-adjoint operator. Show that $\|(\lambda - f)x\| \ge |\operatorname{Im} \lambda|\, \|x\|$, where
$\lambda \in \mathbb{K}$ and $x \in V$.


Exercise 3.27. Let $V$ and $W$ be finite-dimensional inner product spaces and
let $f : V \to W$ be a linear transformation. If $f$ is injective, show that $f^*f$ is an
isomorphism.

Exercise 3.28. Let $V$ and $W$ be inner product spaces and let $f : V \to W$ be a
linear transformation. If $f$ is surjective, show that $ff^*$ is an isomorphism.

Exercise 3.29. Let $V$ and $W$ be inner product spaces and let $v \in V$ and $w \in W$.
Show, with the notation from Exercise 3.8, that we have $(w \otimes v)^* = v \otimes w$.

Exercise 3.30. Let $V$ and $W$ be inner product spaces and let $v \in V$ and $w \in W$.
If $f : W \to V$ is a linear transformation, show that $(w \otimes v) \circ f = w \otimes f^*(v)$,
where $\otimes$ is defined as in Exercise 3.8.

Exercise 3.31. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. Show that $f = 0$ if and only if $f^*f = 0$.


Exercise 3.32. Let $V$ be an inner product space and let $f, g : V \to V$ be linear
operators. If $f$ is self-adjoint, show that $g^*fg$ is self-adjoint.

Exercise 3.33. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. Show that $\operatorname{Id} + f^*f$ is invertible.

Exercise 3.34. Let $V$ be a finite dimensional inner product space and let
$f, g : V \to V$ be self-adjoint operators. Show that $fg$ is self-adjoint if and only
if $fg = gf$.




Exercise 3.35. Let $V$ be an inner product space and let $f, g : V \to V$ be
orthogonal projections. Show that $fg$ is an orthogonal projection if and only if
$fg = gf$.

Exercise 3.36. Let $V$ be an inner product space and let $f, g : V \to V$ be
orthogonal projections. If $fg$ is an orthogonal projection, show that $\operatorname{ran} fg =
\operatorname{ran} f \cap \operatorname{ran} g$.

Exercise 3.37. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. If $f^*f$ is an orthogonal projection, show that $ff^*$
is an orthogonal projection.


Exercise 3.38. Let $V$ be an $n$-dimensional complex inner product space and
let $\{a_1,\dots,a_n\}$ be an orthonormal basis of $V$. For a linear operator $f : V \to V$
we define the trace of $f$, denoted by $\operatorname{tr} f$, as

\[ \operatorname{tr} f = \sum_{j=1}^{n} \langle f(a_j), a_j \rangle. \]

Show that $\operatorname{tr} f$ does not depend on the choice of the orthonormal basis $\{a_1,\dots,a_n\}$,
that is, for any two orthonormal bases $\{a_1,\dots,a_n\}$ and $\{b_1,\dots,b_n\}$ in $V$
we have $\sum_{j=1}^{n} \langle f(a_j), a_j \rangle = \sum_{j=1}^{n} \langle f(b_j), b_j \rangle$.

Exercise 3.39. Let $A = (a_{ij})$ be a matrix from $M_{n \times n}(\mathbb{C})$ and let $f : \mathbb{C}^n \to \mathbb{C}^n$
be the linear operator defined by $f(v) = Av$. Show that $\operatorname{tr} f = a_{11} + \dots + a_{nn}$.
(See Exercise 3.38 for the definition of $\operatorname{tr} f$.)

Exercise 3.40. Let $V$ be an $n$-dimensional inner product space and let $x, y \in V$.
Show that $\operatorname{tr}(x \otimes y) = \langle x, y \rangle$. (See Exercise 3.38 for the definition of $\operatorname{tr} f$ and
Exercise 3.8 for the definition of $x \otimes y$.)

Exercise 3.41. Let $V$ be an inner product space and let $f, g : V \to V$ be linear
operators. Show that $\operatorname{tr}(fg) = \operatorname{tr}(gf)$. (See Exercise 3.38 for the definition of
$\operatorname{tr} f$.)

Exercise 3.42. Let $V$ be an inner product space. Show that the function
$s : \mathcal{L}(V) \times \mathcal{L}(V) \to \mathbb{K}$ defined by $s(f, g) = \operatorname{tr}(g^*f)$ is an inner product. (See
Exercise 3.38 for the definition of $\operatorname{tr} f$.)

Exercise 3.43. Let $V$ be an inner product space and let $f, g : V \to V$ be
self-adjoint operators. If $f^2g^2 = 0$, show that $fg = 0$.


3.6.4 Spectral theorems 


Exercise 3.44. Let $V$ be an inner product space and let $f : V \to V$ be an
invertible linear operator. If $\lambda$ is an eigenvalue of $f$, show that $\frac{1}{\lambda}$ is an eigenvalue
of $f^{-1}$.




Exercise 3.45. Let $V$ be a finite-dimensional inner product space and let
$f, g : V \to V$ be linear operators. Show that, if $\operatorname{Id} - fg$ is invertible, then $\operatorname{Id} - gf$ is
invertible and $(\operatorname{Id} - gf)^{-1} = \operatorname{Id} + g(\operatorname{Id} - fg)^{-1}f$. Then show that, if $\lambda \ne 0$ is an
eigenvalue of $fg$, then $\lambda$ is an eigenvalue of $gf$.

Exercise 3.46. Let $V$ be an inner product space and let $f : V \to V$ be a linear
operator. Let $g = \frac{1}{2}(f + f^*)$ and $h = \frac{1}{2i}(f - f^*)$. Show that $g$ and $h$ are
self-adjoint and $f = g + ih$. Show also that $f$ is normal if and only if $gh = hg$.

Exercise 3.47. Let $V$ be an inner product space and let $f : V \to V$ be a normal
operator. If $\{a_1,\dots,a_n\}$ is an orthonormal basis of $V$ and $\lambda_1,\dots,\lambda_n$ are the
eigenvalues of $f$, show that $\sum_{j=1}^{n} |\lambda_j|^2 = \sum_{j=1}^{n} \|f(a_j)\|^2$.

Exercise 3.48. Let $V$ be an inner product space and let $f : V \to V$ be a linear
operator. Using Exercise 3.19 show that, if the operator $f$ is normal, then there
is an orthonormal basis $\{v_1,\dots,v_n\}$ of $V$ consisting of eigenvectors of $f$.

Exercise 3.49. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. Show that if $ff^*f = f^*ff$, then $(ff^* - f^*f)^2 = 0$
and that $f$ is normal.

Exercise 3.50. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. Show that $f$ is normal if and only if there is a
polynomial $p$ such that $p(f) = f^*$.


Exercise 3.51. Let $V$ be a finite-dimensional inner product space. Show that
the function $S : V \times V \to V \times V$ defined by $S(x, y) = (-y, x)$ is a unitary
operator and that $S^2 = -\operatorname{Id}_{V \times V}$.

Exercise 3.52. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. If

\[ \Gamma_f = \{(x, f(x)) \in V \times V \mid x \in V\} \quad\text{and}\quad \Gamma_{f^*} = \{(x, f^*(x)) \in V \times V \mid x \in V\}, \]

show that $(\Gamma_f)^\perp = S(\Gamma_{f^*})$, where $S$ is defined in Exercise 3.51.

Exercise 3.53. Let $V$ be an $n$-dimensional inner product space and let $f : V \to V$
be a linear operator. Show that $f$ is normal if and only if there is a unitary
operator $g$ such that $fg = f^*$.

Exercise 3.54. Let $V$ be a finite-dimensional inner product space and let
$f, g : V \to V$ be linear operators. If $f$ is positive, show that $g^*fg$ is positive.

Exercise 3.55. Let $V$ be an $n$-dimensional inner product space and let $f : V \to V$
be a self-adjoint operator. Show that there are positive operators $g$ and $h$
such that $f = g - h$.


Exercise 3.56. Consider the operator f € L(C?,C?) defined as f(x) = Ax, 
| 13. 5i 
where A = 


_5i af Show that f is positive and determine its spectral de- 
composition. 




Exercise 3.57. Let $V$ and $W$ be inner product spaces and let $f : V \to W$ be a
linear operator. Show that $f$ is an isometry if and only if there are orthonormal
bases $\{v_1,\dots,v_n\}$ and $\{w_1,\dots,w_n\}$ in $V$ such that $f = \sum_{j=1}^{n} w_j \otimes v_j$, where
$\otimes$ is defined as in Exercise 3.8.


3.6.5 Singular value decomposition 


Exercise 3.58. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. Show that, if $f^*f$ is the projection on a subspace
$U$, then $f$ is a partial isometry with initial space $U$.


Exercise 3.59. Consider the operator f € £(C*,C?) defined as f(x) = Ax, 
where A = E : 


: if Determine the singular value decomposition of f. 


Exercise 3.60. Let $V$ and $W$ be finite dimensional inner product spaces. Show
that for every nonzero linear transformation $f : V \to W$ there are positive
numbers $\sigma_1,\dots,\sigma_r$, orthonormal vectors $v_1,\dots,v_r \in V$, and orthonormal vectors
$w_1,\dots,w_r \in W$ such that $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$, where $\otimes$ is defined as in
Exercise 3.8.

Exercise 3.61. Let $V$ and $W$ be finite-dimensional inner product spaces and
let $f : V \to W$ be a linear operator. If $f = \sum_{j=1}^{r} \alpha_j w_j \otimes v_j$, where $\{v_1,\dots,v_r\}$
are orthonormal vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and
$\alpha_1,\dots,\alpha_r$ are positive numbers such that $\alpha_1 \ge \dots \ge \alpha_r$ (and $\otimes$ is defined as in
Exercise 3.8), show that $f^*f = \sum_{j=1}^{r} |\alpha_j|^2 v_j \otimes v_j$ and consequently $v_1,\dots,v_r$
are eigenvectors of $f^*f$ with corresponding eigenvalues $|\alpha_1|^2,\dots,|\alpha_r|^2$.

Exercise 3.62. Let $V$ and $W$ be finite dimensional inner product spaces and
let $f : V \to W$ be a linear transformation. Let $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$ be the
singular value decomposition of $f$ from Exercise 3.60, where $\{v_1,\dots,v_r\}$ are
orthonormal vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and
$\sigma_1,\dots,\sigma_r$ are positive numbers such that $\sigma_1 \ge \dots \ge \sigma_r$. Show that $\operatorname{ran} f =
\operatorname{Span}\{w_1,\dots,w_r\}$.

Exercise 3.63. Let $V$ and $W$ be finite dimensional inner product spaces and let
$f : V \to W$ be a linear transformation. Let $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$ be the singular
value decomposition of $f$ from Exercise 3.60, where $\{v_1,\dots,v_r\}$ are orthonormal
vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and $\sigma_1,\dots,\sigma_r$ are
positive numbers such that $\sigma_1 \ge \dots \ge \sigma_r$. Let $\{v_1,\dots,v_r,\dots,v_n\}$ be an
orthonormal basis of $V$. Show that $\ker f = \operatorname{Span}\{v_{r+1},\dots,v_n\}$.

Exercise 3.64. Let $V$ and $W$ be finite-dimensional inner product spaces and
let $f : V \to W$ be a linear transformation. Let $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$ be the
singular value decomposition of $f$ from Exercise 3.60, where $\{v_1,\dots,v_r\}$ are
orthonormal vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and
$\sigma_1,\dots,\sigma_r$ are positive numbers such that $\sigma_1 \ge \dots \ge \sigma_r$. Show that $\operatorname{ran} f^* =
\operatorname{Span}\{v_1,\dots,v_r\}$.




Exercise 3.65. Let $V$ and $W$ be finite-dimensional inner product spaces and let
$f : V \to W$ be a linear transformation. Let $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$ be the singular
value decomposition of $f$ from Exercise 3.60, where $\{v_1,\dots,v_r\}$ are orthonormal
vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and $\sigma_1,\dots,\sigma_r$ are
positive numbers such that $\sigma_1 \ge \dots \ge \sigma_r$. Let $\{w_1,\dots,w_r,\dots,w_m\}$ be an
orthonormal basis of $W$. Show that $\ker f^* = \operatorname{Span}\{w_{r+1},\dots,w_m\}$.

Exercise 3.66. Let $V$ and $W$ be finite-dimensional inner product spaces and
let $f : V \to W$ be a linear transformation. Let $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$ be the
singular value decomposition of $f$ from Exercise 3.60, where $\{v_1,\dots,v_r\}$ are
orthonormal vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and
$\sigma_1,\dots,\sigma_r$ are positive numbers such that $\sigma_1 \ge \dots \ge \sigma_r$. Show that, if a linear
transformation $f^+ : W \to V$ is such that $f^+f = \operatorname{proj}_{\operatorname{ran} f^*}$ and $f^+ = 0$ on
$(\operatorname{ran} f)^\perp$, then $f^+ = \sum_{j=1}^{r} \frac{1}{\sigma_j} v_j \otimes w_j$.

Exercise 3.67. Let $V$ and $W$ be finite-dimensional inner product spaces and
let $f : V \to W$ be a linear transformation. Let $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$ be the
singular value decomposition of $f$ from Exercise 3.60, where $\{v_1,\dots,v_r\}$ are
orthonormal vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and
$\sigma_1,\dots,\sigma_r$ are positive numbers such that $\sigma_1 \ge \dots \ge \sigma_r$. Let $f^+ = \sum_{j=1}^{r} \frac{1}{\sigma_j} v_j \otimes w_j$
as in Exercise 3.66. Show that $ff^+$ is the projection on $\operatorname{ran} f$ and $\operatorname{Id}_W - ff^+$
is the projection on $\ker f^*$.

Exercise 3.68. Let $V$ and $W$ be finite-dimensional inner product spaces and let
$f : V \to W$ be a linear transformation. Let $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$ be the singular
value decomposition of $f$ from Exercise 3.60, where $\{v_1,\dots,v_r\}$ are orthonormal
vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and $\sigma_1,\dots,\sigma_r$ are
positive numbers such that $\sigma_1 \ge \dots \ge \sigma_r$. If $f^+ = \sum_{j=1}^{r} \frac{1}{\sigma_j} v_j \otimes w_j$ (as in
Exercise 3.66), show that $f^+ff^+ = f^+$.

Exercise 3.69. Let $V$ and $W$ be finite-dimensional inner product spaces and
let $f : V \to W$ be a linear transformation. Let $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$ be the
singular value decomposition of $f$ from Exercise 3.60, where $\{v_1,\dots,v_r\}$ are
orthonormal vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and
$\sigma_1,\dots,\sigma_r$ are positive numbers such that $\sigma_1 \ge \dots \ge \sigma_r$. If $f$ is injective, show
that $f^*f$ is invertible and $(f^*f)^{-1}f^* = f^+$, where $f^+$ is defined in Exercise
3.66.

Exercise 3.70. Let $V$ and $W$ be finite-dimensional inner product spaces and
let $f : V \to W$ be a linear transformation. Show that $(f^*)^+ = (f^+)^*$, where $f^+$
is defined in Exercise 3.66.

Exercise 3.71. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. Obtain, using Exercise 3.12 and without using the
proof of Theorem 3.5.2, the following form of singular value decomposition:

If $\dim V = n$, then there are orthonormal bases $\{v_1,\dots,v_n\}$ and $\{u_1,\dots,u_n\}$
of $V$ and nonnegative numbers $\sigma_1,\dots,\sigma_n$ such that $f = \sum_{j=1}^{n} \sigma_j u_j \otimes v_j$.




Exercise 3.72. Let $V$ be a finite-dimensional inner product space and let
$f : V \to V$ be a linear operator. Using Exercise 3.71 show that there is an isometry
$g$ such that $f = g|f|$. Note that this is a form of the polar decomposition of $f$.

Exercise 3.73. Let $V$ be a finite dimensional inner product space and let
$f : V \to V$ be a linear operator. Using Exercise 3.71 show that there is an
isometry $g : V \to V$ such that $f = \sqrt{ff^*}\,g$.

Exercise 3.74. Let $V$ and $W$ be finite dimensional inner product spaces and let
$f : V \to W$ be a linear transformation. Let $f = \sum_{j=1}^{r} \sigma_j w_j \otimes v_j$ be the singular
value decomposition of $f$ from Exercise 3.60, where $\{v_1,\dots,v_r\}$ are orthonormal
vectors in $V$, $\{w_1,\dots,w_r\}$ are orthonormal vectors in $W$, and $\sigma_1,\dots,\sigma_r$ are
positive numbers such that $\sigma_1 \ge \dots \ge \sigma_r$. Let $\{v_1,\dots,v_r,\dots,v_n\}$ be an
orthonormal basis of $V$. If $f^+ = \sum_{j=1}^{r} \frac{1}{\sigma_j} v_j \otimes w_j$, where $\otimes$ is defined as in
Exercise 3.8, show that every least squares solution $x$ of the equation $f(x) = b$
is of the form $x = f^+(b) + x_{r+1}v_{r+1} + \dots + x_n v_n$ where $x_{r+1},\dots,x_n \in \mathbb{K}$ are
arbitrary. Moreover, there is a unique least squares solution of minimal length,
which is $x = f^+(b)$.


Chapter 4 


Reduction of 
Endomorphisms 


Introduction 


The main topic of this chapter is the following question: Given an endomorphism
$f$ on a finite-dimensional vector space $V$, can we find a basis of $V$ such that
the matrix of $f$ is simple and easy to work with, that is, diagonal or
block-diagonal? This will help us better understand the structure of endomorphisms
and provide important tools for applications of linear algebra, for example, to
solve differential equations.

At the beginning of the chapter we discuss alternating multilinear forms and
determinants of endomorphisms which will give us a practical way of determining
the diagonal and block-diagonal matrices of an endomorphism.

Our presentation of determinants is self-contained, that is, it does not use 
results on determinants from elementary courses. 

In the context of this chapter it is customary to use the name endomorphisms 
instead of linear operators. 


4.1 Eigenvalues and diagonalization 


4.1.1 Multilinear alternating forms and determinants 


At the beginning of Chapter 3 we introduce bilinear forms. They are defined as
functions $f : V \times V \to \mathbb{K}$ that are linear in each variable, that is,

the function $f_x : V \to \mathbb{K}$ defined as $f_x(y) = f(x, y)$ is linear for every
$x \in V$ and

the function $f_y : V \to \mathbb{K}$ defined as $f_y(x) = f(x, y)$ is linear for every
$y \in V$.




A similar definition can be given for any function from $V^n = V \times \dots \times V$ ($n$ times) to $\mathbb{K}$.


Definition 4.1.1. Let $V$ be a vector space. A function $E : V^n \to \mathbb{K}$ is
called an $n$-linear form or a multilinear form if for every $j \in \{1,\dots,n\}$
and every $x_1,\dots,x_{j-1},x_{j+1},\dots,x_n \in V$ the function from $V$ to $\mathbb{K}$
defined as
\[ x \mapsto E(x_1,\dots,x_{j-1},x,x_{j+1},\dots,x_n) \]

is a linear form.


Example 4.1.2. The function $E : \mathbb{K}^n \to \mathbb{K}$ defined as
\[ E(x_1,\dots,x_n) = c\,x_1 \cdots x_n \]

is an $n$-linear form for any $c \in \mathbb{K}$.

This example can be generalized in the following way. Let $V$ be a vector
space and let $f_j : V \to \mathbb{K}$ be a linear function for $j \in \{1,\dots,n\}$. Then the
function $E : V^n \to \mathbb{K}$ defined as

\[ E(x_1,\dots,x_n) = f_1(x_1) \cdots f_n(x_n) \]

is an $n$-linear form.


Definition 4.1.3. Let $V$ be a vector space. An $n$-linear form $E : V^n \to \mathbb{K}$
is called an alternating $n$-linear form (or alternating multilinear form) if

\[ E(x_1,\dots,x_n) = 0 \]

whenever $x_j = x_k$ for some $j \ne k$.


The following property of alternating multilinear forms is often used in
calculations. It is equivalent to the condition in the definition of alternating
multilinear forms.




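Theorem 4.1.4. Let $V$ be a vector space and let $E : V^n \to \mathbb{K}$ be an
alternating multilinear form. If $1 \le j < k \le n$, then

\[ E(x_1,\dots,x_{j-1},x_k,x_{j+1},\dots,x_{k-1},x_j,x_{k+1},\dots,x_n)
= -E(x_1,\dots,x_{j-1},x_j,x_{j+1},\dots,x_{k-1},x_k,x_{k+1},\dots,x_n) \]

for all $x_1,\dots,x_n \in V$.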


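Proof. Since

\begin{align*}
0 &= E(x_1,\dots,x_{j-1},x_j + x_k,x_{j+1},\dots,x_{k-1},x_j + x_k,x_{k+1},\dots,x_n) \\
&= E(x_1,\dots,x_{j-1},x_j,x_{j+1},\dots,x_{k-1},x_j,x_{k+1},\dots,x_n) \\
&\quad + E(x_1,\dots,x_{j-1},x_j,x_{j+1},\dots,x_{k-1},x_k,x_{k+1},\dots,x_n) \\
&\quad + E(x_1,\dots,x_{j-1},x_k,x_{j+1},\dots,x_{k-1},x_j,x_{k+1},\dots,x_n) \\
&\quad + E(x_1,\dots,x_{j-1},x_k,x_{j+1},\dots,x_{k-1},x_k,x_{k+1},\dots,x_n) \\
&= E(x_1,\dots,x_{j-1},x_j,x_{j+1},\dots,x_{k-1},x_k,x_{k+1},\dots,x_n) \\
&\quad + E(x_1,\dots,x_{j-1},x_k,x_{j+1},\dots,x_{k-1},x_j,x_{k+1},\dots,x_n),
\end{align*}

we have
\[ E(x_1,\dots,x_{j-1},x_k,x_{j+1},\dots,x_{k-1},x_j,x_{k+1},\dots,x_n)
= -E(x_1,\dots,x_{j-1},x_j,x_{j+1},\dots,x_{k-1},x_k,x_{k+1},\dots,x_n). \]

□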


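The following three examples indicate that there is a connection between
alternating forms and determinants as defined in elementary courses. The full
scope of that connection will become clear later in this chapter. In these
examples the determinant of a matrix $\begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \in M_{2 \times 2}(\mathbb{K})$ is defined as usual by

\[ \det \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} = \alpha\delta - \beta\gamma. \]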


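Example 4.1.5. Let $V$ be a vector space and let $E : V \times V \to \mathbb{K}$ be an
alternating bilinear form. Show that for every $v_1, v_2 \in V$ and $\alpha, \beta, \gamma, \delta \in \mathbb{K}$
we have

\[ E(\alpha v_1 + \beta v_2, \gamma v_1 + \delta v_2) = \det \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} E(v_1, v_2). \tag{4.1} \]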




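Proof. For every $v_1, v_2 \in V$ and $\alpha, \beta, \gamma, \delta \in \mathbb{K}$ we have

\begin{align*}
E(\alpha v_1 + \beta v_2, \gamma v_1 + \delta v_2) &= E(\alpha v_1, \gamma v_1) + E(\alpha v_1, \delta v_2) + E(\beta v_2, \gamma v_1) + E(\beta v_2, \delta v_2) \\
&= \alpha\gamma E(v_1, v_1) + \alpha\delta E(v_1, v_2) + \beta\gamma E(v_2, v_1) + \beta\delta E(v_2, v_2) \\
&= \alpha\delta E(v_1, v_2) + \beta\gamma E(v_2, v_1) \\
&= \alpha\delta E(v_1, v_2) - \beta\gamma E(v_1, v_2) \\
&= (\alpha\delta - \beta\gamma) E(v_1, v_2) \\
&= \det \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} E(v_1, v_2).
\end{align*}

Note that, if $E$ satisfies (4.1) for every $v_1, v_2 \in V$ and $\alpha, \beta, \gamma, \delta \in \mathbb{K}$, then
$E$ is alternating. □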
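
As a quick numerical sanity check of identity (4.1), the following sketch evaluates both sides for one sample alternating bilinear form. The form `E`, the helper `combo`, the vectors and the scalars are all assumptions chosen for illustration; they do not come from the text.

```python
def E(v, w):
    # A sample alternating bilinear form on R^3 (hypothetical choice):
    # it is bilinear and E(v, v) = 0 for every v.
    return v[0] * w[2] - v[2] * w[0]

def combo(a, v, b, w):
    """Return the linear combination a*v + b*w of two vectors."""
    return [a * vi + b * wi for vi, wi in zip(v, w)]

v1, v2 = [1, 2, 3], [4, 0, -1]
alpha, beta, gamma, delta = 2, 5, -3, 7

# Left-hand side of (4.1): E(alpha v1 + beta v2, gamma v1 + delta v2).
lhs = E(combo(alpha, v1, beta, v2), combo(gamma, v1, delta, v2))
# Right-hand side of (4.1): (alpha*delta - beta*gamma) * E(v1, v2).
rhs = (alpha * delta - beta * gamma) * E(v1, v2)

print(lhs, rhs)
```

Both printed values agree, as (4.1) predicts; working with integers keeps the comparison exact.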


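Example 4.1.6. Let $V$ be a vector space and let $E : V \times V \to \mathbb{K}$ be an alternating
bilinear form. Show that for every $x, y, z \in V$ and $a_{11}, a_{21}, a_{31}, a_{12}, a_{22},
a_{32} \in \mathbb{K}$ we have

\begin{align*}
E(&a_{11}x + a_{21}y + a_{31}z,\ a_{12}x + a_{22}y + a_{32}z) \\
&= \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} E(x, y)
+ \det \begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix} E(x, z)
+ \det \begin{bmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} E(y, z).
\end{align*}

Solution.
\begin{align*}
E(&a_{11}x + a_{21}y + a_{31}z,\ a_{12}x + a_{22}y + a_{32}z) \\
&= E(a_{11}x, a_{22}y) + E(a_{21}y, a_{12}x) + E(a_{11}x, a_{32}z) \\
&\quad + E(a_{31}z, a_{12}x) + E(a_{21}y, a_{32}z) + E(a_{31}z, a_{22}y) \\
&= E(a_{11}x + a_{21}y, a_{12}x + a_{22}y) + E(a_{11}x + a_{31}z, a_{12}x + a_{32}z) \\
&\quad + E(a_{21}y + a_{31}z, a_{22}y + a_{32}z) \\
&= \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} E(x, y)
+ \det \begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix} E(x, z)
+ \det \begin{bmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} E(y, z).
\end{align*}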



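Example 4.1.7. Let $x, y, z$ be vectors in a vector space $V$ and let $E : V \times V \times V \to \mathbb{K}$
be an alternating multilinear form. Show that

\begin{align*}
E(&a_{11}x + a_{21}y + a_{31}z,\ a_{12}x + a_{22}y + a_{32}z,\ a_{13}x + a_{23}y + a_{33}z) \\
&= \left( a_{11} \det \begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix}
- a_{21} \det \begin{bmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{bmatrix}
+ a_{31} \det \begin{bmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{bmatrix} \right) E(x, y, z) \\
&= -\left( a_{12} \det \begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix}
- a_{22} \det \begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix}
+ a_{32} \det \begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix} \right) E(x, y, z) \\
&= \left( a_{13} \det \begin{bmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix}
- a_{23} \det \begin{bmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{bmatrix}
+ a_{33} \det \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \right) E(x, y, z)
\end{align*}

for every $a_{11}, a_{21}, a_{31}, a_{12}, a_{22}, a_{32}, a_{13}, a_{23}, a_{33} \in \mathbb{K}$.

Solution. We prove the second equality. The other equalities can be proven in
the same way.

We apply the result from Example 4.1.6 to the function $G : V \times V \to \mathbb{K}$
defined by
\[ G(s, t) = E(s, a_{12}x + a_{22}y + a_{32}z, t), \]

where $x, y, z \in V$ are arbitrary but fixed, and obtain

\begin{align*}
E(&a_{11}x + a_{21}y + a_{31}z,\ a_{12}x + a_{22}y + a_{32}z,\ a_{13}x + a_{23}y + a_{33}z) \\
&= \det \begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix} E(x, a_{12}x + a_{22}y + a_{32}z, y)
+ \det \begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix} E(x, a_{12}x + a_{22}y + a_{32}z, z) \\
&\quad + \det \begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix} E(y, a_{12}x + a_{22}y + a_{32}z, z) \\
&= \det \begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix} E(x, a_{32}z, y)
+ \det \begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix} E(x, a_{22}y, z)
+ \det \begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix} E(y, a_{12}x, z) \\
&= a_{32} \det \begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix} E(x, z, y)
+ a_{22} \det \begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix} E(x, y, z)
+ a_{12} \det \begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix} E(y, x, z) \\
&= -a_{32} \det \begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix} E(x, y, z)
+ a_{22} \det \begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix} E(x, y, z)
- a_{12} \det \begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix} E(x, y, z) \\
&= -\left( a_{12} \det \begin{bmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{bmatrix}
- a_{22} \det \begin{bmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{bmatrix}
+ a_{32} \det \begin{bmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{bmatrix} \right) E(x, y, z).
\end{align*}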




Theorem 4.1.8. Let $V$ be a vector space and let $E : V^n \to \mathbb{K}$ be an
alternating multilinear form. If $x_1,\dots,x_n \in V$ are linearly dependent,
then $E(x_1,\dots,x_n) = 0$.

Proof. Without loss of generality, we can assume that $x_1 = \sum_{j=2}^{n} a_j x_j$. Then

\[ E(x_1,\dots,x_n) = E\Big( \sum_{j=2}^{n} a_j x_j, x_2,\dots,x_n \Big) = \sum_{j=2}^{n} a_j E(x_j, x_2,\dots,x_n) = 0, \]

because $E$ is alternating. □


Theorem 4.1.9. Let $V$ be a vector space and let $E : V^n \to \mathbb{K}$ be a
multilinear form. The function $G : V^n \to \mathbb{K}$
defined by

\[ G(x_1,\dots,x_n) = \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma) E(x_{\sigma(1)},\dots,x_{\sigma(n)}) \]

is an alternating multilinear form.


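Proof. It is easy to see that $G$ is a multilinear form.

Now suppose that $x_j = x_k$ for some distinct $j, k \in \{1,\dots,n\}$. Let $\tau = (j\,k) \in
\mathfrak{S}_n$, that is, the transposition such that $\tau(j) = k$, $\tau(k) = j$, and $\tau(l) = l$ for
any $l \in \{1,\dots,n\}$ different from $j$ and $k$. First we note that if $\sigma$ is an even
permutation then $\tau\sigma$ is an odd permutation and the function $s : \mathcal{E}_n \to \mathcal{O}_n$
defined by $s(\sigma) = \tau\sigma$ is a bijection, where $\mathcal{E}_n$ and $\mathcal{O}_n$ denote the sets of even and odd permutations in $\mathfrak{S}_n$.

We have
\begin{align*}
G(x_1,\dots,x_n) &= \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma) E(x_{\sigma(1)},\dots,x_{\sigma(n)}) \\
&= \sum_{\sigma \in \mathcal{E}_n} \epsilon(\sigma) E(x_{\sigma(1)},\dots,x_{\sigma(n)}) + \sum_{\sigma \in \mathcal{O}_n} \epsilon(\sigma) E(x_{\sigma(1)},\dots,x_{\sigma(n)}) \\
&= \sum_{\sigma \in \mathcal{E}_n} \epsilon(\sigma) E(x_{\sigma(1)},\dots,x_{\sigma(n)}) + \sum_{\sigma \in \mathcal{E}_n} \epsilon(\tau\sigma) E(x_{\tau\sigma(1)},\dots,x_{\tau\sigma(n)}) \\
&= \sum_{\sigma \in \mathcal{E}_n} E(x_{\sigma(1)},\dots,x_{\sigma(n)}) - \sum_{\sigma \in \mathcal{E}_n} E(x_{\tau\sigma(1)},\dots,x_{\tau\sigma(n)}).
\end{align*}

Now we consider three cases for $l \in \{1,\dots,n\}$:

Case 1: If $\sigma(l) \ne j$ and $\sigma(l) \ne k$, then $\tau\sigma(l) = \sigma(l)$.

Case 2: If $\sigma(l) = j$, then $\tau\sigma(l) = \tau(j) = k$ and $x_{\sigma(l)} = x_j = x_k = x_{\tau\sigma(l)}$.

Case 3: If $\sigma(l) = k$, then $\tau\sigma(l) = \tau(k) = j$ and $x_{\sigma(l)} = x_k = x_j = x_{\tau\sigma(l)}$.

Consequently
\[ G(x_1,\dots,x_n) = \sum_{\sigma \in \mathcal{E}_n} E(x_{\sigma(1)},\dots,x_{\sigma(n)}) - \sum_{\sigma \in \mathcal{E}_n} E(x_{\tau\sigma(1)},\dots,x_{\tau\sigma(n)}) = 0 \]
and thus $G$ is an alternating multilinear form. □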
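
The construction in Theorem 4.1.9 can be tried out on a small case. The sketch below, with $n = 3$ and $V = \mathbb{R}^3$, starts from a multilinear form `E` that is not alternating (a product of coordinate forms, in the spirit of Example 4.1.2) and forms the signed sum `G`. The function names, the sample vectors and the use of inversion counting to compute the sign of a permutation are assumptions made for this illustration.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def E(x1, x2, x3):
    # Product of coordinate forms: multilinear, but not alternating.
    return x1[0] * x2[1] * x3[2]

def G(*xs):
    """Signed sum of E over all permutations of the three arguments."""
    return sum(sign(p) * E(*(xs[i] for i in p))
               for p in permutations(range(3)))

x, y, z = [1, 2, 3], [0, 1, 4], [5, 6, 0]

print(E(x, x, y))              # nonzero: E itself is not alternating
print(G(x, x, z), G(x, y, x))  # both vanish: G is alternating
print(G(x, y, z), G(y, x, z))  # opposite signs, as in Theorem 4.1.4
```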


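Theorem 4.1.10. Let $V$ be a vector space and let $E : V^n \to \mathbb{K}$ be an
alternating $n$-linear form. Then

\[ E(x_{\sigma(1)},\dots,x_{\sigma(n)}) = \epsilon(\sigma) E(x_1,\dots,x_n) \]

for any $\sigma \in \mathfrak{S}_n$ and $x_1,\dots,x_n \in V$.

Proof. Since, by Theorem 4.1.4, we have
\[ E(x_{\tau(1)},\dots,x_{\tau(n)}) = -E(x_1,\dots,x_n) \]

for every transposition $\tau$, the result follows from Theorem 5.1 in Appendix
A. □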


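Theorem 4.1.11. Let $V$ be a vector space and let $E : V^n \to \mathbb{K}$ be an
alternating $n$-linear form. If

\[ x_j = a_{1j}v_1 + \dots + a_{nj}v_n, \]
where $v_1,\dots,v_n, x_1,\dots,x_n \in V$, $a_{kj} \in \mathbb{K}$, and $j, k \in \{1,\dots,n\}$, then

\[ E(x_1,\dots,x_n) = \Big( \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} \Big) E(v_1,\dots,v_n). \]

Proof. Since $E$ is alternating $n$-linear, we have

\[ E(x_1,\dots,x_n) = \sum_{\sigma \in \mathfrak{S}_n} a_{\sigma(1),1} \cdots a_{\sigma(n),n}\, E(v_{\sigma(1)},\dots,v_{\sigma(n)}), \]

because every term in which some vector $v_k$ appears twice vanishes. Consequently, by Theorem 4.1.10,

\[ E(x_1,\dots,x_n) = \Big( \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} \Big) E(v_1,\dots,v_n). \]

□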


Theorem 4.1.12. Let $V$ be an $n$-dimensional vector space and let
$E : V^n \to \mathbb{K}$ be a nonzero alternating $n$-linear form. Then vectors
$v_1,\dots,v_n \in V$ constitute a basis of $V$ if and only if $E(v_1,\dots,v_n) \ne 0$.




Proof. If $\{v_1,\dots,v_n\}$ is a basis of $V$ and $E(v_1,\dots,v_n) = 0$, then $E = 0$ by
Theorem 4.1.11. Consequently, if $\{v_1,\dots,v_n\}$ is a basis and $E \ne 0$, then
$E(v_1,\dots,v_n) \ne 0$.

If the vectors $v_1,\dots,v_n$ are linearly dependent, then $E(v_1,\dots,v_n) = 0$, by
Theorem 4.1.8. Consequently, if $E(v_1,\dots,v_n) \ne 0$, then $v_1,\dots,v_n$ must be
linearly independent, and thus $\{v_1,\dots,v_n\}$ is a basis of $V$. □

Example 4.1.13 (Cramer's rule). Let $V$ be a vector space and let $\{v_1,\dots,v_n\}$
be a basis of $V$. If $D : V^n \to \mathbb{K}$ is a nonzero alternating multilinear form and
\[ a = x_1v_1 + \dots + x_nv_n, \]
then

\[ x_j = \frac{D(v_1,\dots,v_{j-1},a,v_{j+1},\dots,v_n)}{D(v_1,\dots,v_n)} \]

for every $j \in \{1,\dots,n\}$.

Proof. For every $j \in \{1,\dots,n\}$ we have

\begin{align*}
D(v_1,\dots,v_{j-1},a,v_{j+1},\dots,v_n)
&= D(v_1,\dots,v_{j-1},x_1v_1 + \dots + x_nv_n,v_{j+1},\dots,v_n) \\
&= x_1 D(v_1,\dots,v_{j-1},v_1,v_{j+1},\dots,v_n) \\
&\quad + \dots + x_j D(v_1,\dots,v_{j-1},v_j,v_{j+1},\dots,v_n) \\
&\quad + \dots + x_n D(v_1,\dots,v_{j-1},v_n,v_{j+1},\dots,v_n) \\
&= x_j D(v_1,\dots,v_{j-1},v_j,v_{j+1},\dots,v_n).
\end{align*}
This gives us the desired equality. □
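
The formula of Example 4.1.13 translates directly into a small computation. The sketch below takes $V = \mathbb{Q}^3$ and, as the alternating form $D$, the signed sum over permutations of products of coordinates; this particular choice of `D`, the helper names and the sample basis are assumptions for illustration only. Exact rational arithmetic (`fractions.Fraction`) avoids rounding.

```python
from fractions import Fraction
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def D(*vectors):
    """An alternating n-linear form on K^n: signed sum over permutations."""
    n = len(vectors)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for j in range(n):
            term *= vectors[j][p[j]]   # coordinate p(j) of the j-th vector
        total += term
    return total

def coordinates(basis, a):
    """Coordinates x_1, ..., x_n of a in the given basis, by Cramer's rule."""
    d = D(*basis)
    if d == 0:
        raise ValueError("the given vectors do not form a basis")
    # Replace the j-th basis vector by a and divide by D(v_1, ..., v_n).
    return [Fraction(D(*(basis[:j] + [a] + basis[j + 1:])), d)
            for j in range(len(basis))]

basis = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
a = [2, 3, 5]
x = coordinates(basis, a)
print(x)
```

The returned coordinates can be checked by forming $x_1v_1 + x_2v_2 + x_3v_3$ and comparing with `a`.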


Theorem 4.1.14. Let $\{v_1,\dots,v_n\}$ be a basis of a vector space $V$. There
is a unique alternating $n$-linear form

\[ D_{v_1,\dots,v_n} : V^n \to \mathbb{K} \]

such that $D_{v_1,\dots,v_n}(v_1,\dots,v_n) = 1$.

Proof. Let $\{v_1,\dots,v_n\}$ be a basis of $V$. For $x = a_1v_1 + \dots + a_nv_n$ and $j \in
\{1,\dots,n\}$ we define

\[ l_{v_j}(x) = l_{v_j}(a_1v_1 + \dots + a_nv_n) = a_j. \]

Clearly, the function $E_{v_1,\dots,v_n} : V^n \to \mathbb{K}$ defined by

\[ E_{v_1,\dots,v_n}(x_1,\dots,x_n) = l_{v_1}(x_1) \cdots l_{v_n}(x_n) \]

is $n$-linear. According to Theorem 4.1.9 the function $D_{v_1,\dots,v_n} : V^n \to \mathbb{K}$ defined
by

\[ D_{v_1,\dots,v_n}(x_1,\dots,x_n) = \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma)\, l_{v_1}(x_{\sigma(1)}) \cdots l_{v_n}(x_{\sigma(n)}) \]

is an alternating $n$-linear form such that $D_{v_1,\dots,v_n}(v_1,\dots,v_n) = 1$. The uniqueness
is a consequence of Theorem 4.1.11. □


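Note that the sums

\[ \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} \quad\text{and}\quad \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma)\, a_{1,\sigma(1)} \cdots a_{n,\sigma(n)} \]

are equal. Indeed, if $a_{jk} \in \mathbb{K}$ for $j, k \in \{1,\dots,n\}$ and $\sigma, \tau \in \mathfrak{S}_n$, then
\[ a_{\sigma(1),1} \cdots a_{\sigma(n),n} = a_{\sigma\tau(1),\tau(1)} \cdots a_{\sigma\tau(n),\tau(n)} \]
and consequently
\[ a_{\sigma(1),1} \cdots a_{\sigma(n),n} = a_{\sigma\sigma^{-1}(1),\sigma^{-1}(1)} \cdots a_{\sigma\sigma^{-1}(n),\sigma^{-1}(n)} = a_{1,\sigma^{-1}(1)} \cdots a_{n,\sigma^{-1}(n)}. \]

Hence
\begin{align*}
\sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n}
&= \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma)\, a_{1,\sigma^{-1}(1)} \cdots a_{n,\sigma^{-1}(n)} \\
&= \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma^{-1})\, a_{1,\sigma^{-1}(1)} \cdots a_{n,\sigma^{-1}(n)} \\
&= \sum_{\sigma \in \mathfrak{S}_n} \epsilon(\sigma)\, a_{1,\sigma(1)} \cdots a_{n,\sigma(n)}.
\end{align*}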


Definition 4.1.15. Let $A$ be an $n \times n$ matrix with entries $a_{jk}$. The number
\[ \sum_{\sigma\in\mathfrak{S}_n} \varepsilon(\sigma)\, a_{\sigma(1),1}\cdots a_{\sigma(n),n} \]
is called the determinant of $A$ and is denoted by $\det A$.


Note that from the calculations presented before the above definition it follows
that $\det A = \det A^T$.


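It is easy to verify that our definition of the determinant agrees with the
familiar formulas for $2 \times 2$ and $3 \times 3$ matrices:
\[ \det\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} = a_{11}a_{22} - a_{12}a_{21} \]
and
\begin{align*}
\det\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
&= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} \\
&\quad - a_{13}a_{22}a_{31} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33}.
\end{align*}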
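The sum over permutations in Definition 4.1.15 can be evaluated directly for small matrices. The sketch below is one possible way to do this in Python; the names `sign`, `det` and `transpose` and the two sample matrices are assumptions made for illustration. It also checks the equality $\det A = \det A^T$ on the samples.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation of (0, ..., n-1), computed by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def det(a):
    """Sum over all permutations sigma of sign(sigma) * a[sigma(1)][1] * ... * a[sigma(n)][n]."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for k in range(n):
            term *= a[sigma[k]][k]
        total += term
    return total


def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


# Hypothetical sample matrices.
A2 = [[1, 2], [3, 4]]
A3 = [[2, -1, 0], [1, 3, 4], [5, 0, -2]]

det_A2 = det(A2)
det_A3 = det(A3)
det_A3_transposed = det(transpose(A3))
```

The number of terms grows like $n!$, so this approach is only meant for checking small examples by hand.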


Theorem 4.1.16. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $D : \mathcal{V}^n \to \mathbb{K}$
be a nonzero alternating $n$-linear form. For every alternating
$n$-linear form $E : \mathcal{V}^n \to \mathbb{K}$ we have
\[ E = \alpha D \]
for some unique $\alpha \in \mathbb{K}$.


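Proof. Let $\{\mathbf{v}_1,\dots,\mathbf{v}_n\}$ be a basis in $\mathcal{V}$. For every $\mathbf{x}_1,\dots,\mathbf{x}_n \in \mathcal{V}$ we have
\[ \mathbf{x}_j = a_{1j}\mathbf{v}_1 + \dots + a_{nj}\mathbf{v}_n, \]
where $a_{jk} \in \mathbb{K}$ and $j,k \in \{1,\dots,n\}$. Now, using Theorem 4.1.11, we obtain
\begin{align*}
E(\mathbf{x}_1,\dots,\mathbf{x}_n)
&= \Big( \sum_{\sigma\in\mathfrak{S}_n} \varepsilon(\sigma)\, a_{\sigma(1),1}\cdots a_{\sigma(n),n} \Big) E(\mathbf{v}_1,\dots,\mathbf{v}_n) \\
&= \Big( \sum_{\sigma\in\mathfrak{S}_n} \varepsilon(\sigma)\, a_{\sigma(1),1}\cdots a_{\sigma(n),n} \Big) E(\mathbf{v}_1,\dots,\mathbf{v}_n)\, \frac{D(\mathbf{v}_1,\dots,\mathbf{v}_n)}{D(\mathbf{v}_1,\dots,\mathbf{v}_n)} \\
&= \frac{E(\mathbf{v}_1,\dots,\mathbf{v}_n)}{D(\mathbf{v}_1,\dots,\mathbf{v}_n)} \Big( \sum_{\sigma\in\mathfrak{S}_n} \varepsilon(\sigma)\, a_{\sigma(1),1}\cdots a_{\sigma(n),n} \Big) D(\mathbf{v}_1,\dots,\mathbf{v}_n) \\
&= \frac{E(\mathbf{v}_1,\dots,\mathbf{v}_n)}{D(\mathbf{v}_1,\dots,\mathbf{v}_n)}\, D(\mathbf{x}_1,\dots,\mathbf{x}_n).
\end{align*}
This means that $E = \alpha D$ where $\alpha = \dfrac{E(\mathbf{v}_1,\dots,\mathbf{v}_n)}{D(\mathbf{v}_1,\dots,\mathbf{v}_n)}$.
Note that since $E(\mathbf{x}_1,\dots,\mathbf{x}_n) = \alpha D(\mathbf{x}_1,\dots,\mathbf{x}_n)$ for every $\mathbf{x}_1,\dots,\mathbf{x}_n \in \mathcal{V}$,
the constant $\alpha$ does not depend on the choice of a basis in $\mathcal{V}$. Now the uniqueness
of $\alpha$ is immediate. □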


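Theorem 4.1.17. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism. There is a unique number $\alpha \in \mathbb{K}$ such that
for every $\mathbf{v}_1,\dots,\mathbf{v}_n \in \mathcal{V}$ and every alternating $n$-linear form $E : \mathcal{V}^n \to \mathbb{K}$
we have
\[ E(f(\mathbf{v}_1),\dots,f(\mathbf{v}_n)) = \alpha E(\mathbf{v}_1,\dots,\mathbf{v}_n). \]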




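Proof. Let $D : \mathcal{V}^n \to \mathbb{K}$ be a nonzero alternating $n$-linear form. The function
$F : \mathcal{V}^n \to \mathbb{K}$ defined by
\[ F(\mathbf{v}_1,\dots,\mathbf{v}_n) = D(f(\mathbf{v}_1),\dots,f(\mathbf{v}_n)) \]
is an alternating $n$-linear form. Consequently, by Theorem 4.1.16, there is a
number $\alpha \in \mathbb{K}$ such that
\[ D(f(\mathbf{v}_1),\dots,f(\mathbf{v}_n)) = F(\mathbf{v}_1,\dots,\mathbf{v}_n) = \alpha D(\mathbf{v}_1,\dots,\mathbf{v}_n). \]
Now, applying Theorem 4.1.16 to an alternating $n$-linear form $E : \mathcal{V}^n \to \mathbb{K}$ we
obtain a number $\beta \in \mathbb{K}$ such that $E = \beta D$. Hence
\[ E(f(\mathbf{v}_1),\dots,f(\mathbf{v}_n)) = \beta D(f(\mathbf{v}_1),\dots,f(\mathbf{v}_n)) = \beta\alpha D(\mathbf{v}_1,\dots,\mathbf{v}_n) = \alpha E(\mathbf{v}_1,\dots,\mathbf{v}_n). \]
It is clear that $\alpha$ is unique. □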


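Definition 4.1.18. Let $\mathcal{V}$ be an $n$-dimensional vector space and let
$f : \mathcal{V} \to \mathcal{V}$ be an endomorphism. The number $\alpha \in \mathbb{K}$ such that for every
$\mathbf{v}_1,\dots,\mathbf{v}_n \in \mathcal{V}$ and every alternating $n$-linear form $E : \mathcal{V}^n \to \mathbb{K}$ we have
$E(f(\mathbf{v}_1),\dots,f(\mathbf{v}_n)) = \alpha E(\mathbf{v}_1,\dots,\mathbf{v}_n)$ is called the determinant of $f$ and
is denoted by $\det f$.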


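Using the notation from the above definition we can write that, if $f$ is an
endomorphism on an $n$-dimensional vector space $\mathcal{V}$, then
\[ E(f(\mathbf{v}_1),\dots,f(\mathbf{v}_n)) = \det f \, E(\mathbf{v}_1,\dots,\mathbf{v}_n) \]
for every $\mathbf{v}_1,\dots,\mathbf{v}_n \in \mathcal{V}$ and every alternating $n$-linear form $E : \mathcal{V}^n \to \mathbb{K}$.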


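Example 4.1.19. Let $\mathcal{V}$ be a 2-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism. If $\{\mathbf{v}_1, \mathbf{v}_2\}$ is a basis of $\mathcal{V}$ such that
\[ f(\mathbf{v}_1) = \alpha\mathbf{v}_1 + \beta\mathbf{v}_2 \quad\text{and}\quad f(\mathbf{v}_2) = \gamma\mathbf{v}_1 + \delta\mathbf{v}_2, \]
show that $\det f = \alpha\delta - \beta\gamma$.

Proof. Let $E$ be an arbitrary alternating bilinear form $E : \mathcal{V} \times \mathcal{V} \to \mathbb{K}$. Then
\[ E(f(\mathbf{v}_1), f(\mathbf{v}_2)) = E(\alpha\mathbf{v}_1 + \beta\mathbf{v}_2, \gamma\mathbf{v}_1 + \delta\mathbf{v}_2). \]
Now we continue as in Example 4.1.5. □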




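Theorem 4.1.20. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f$ and
$g$ be endomorphisms on $\mathcal{V}$. Then
\[ \det(gf) = \det g \, \det f. \]

Proof. Let $D : \mathcal{V}^n \to \mathbb{K}$ be a nonzero alternating $n$-linear form and let
$\mathbf{x}_1,\dots,\mathbf{x}_n \in \mathcal{V}$. Then
\[ D(gf(\mathbf{x}_1),\dots,gf(\mathbf{x}_n)) = \det(gf)\, D(\mathbf{x}_1,\dots,\mathbf{x}_n) \]
and
\[ D(gf(\mathbf{x}_1),\dots,gf(\mathbf{x}_n)) = \det g \, D(f(\mathbf{x}_1),\dots,f(\mathbf{x}_n)) = \det g \, \det f \, D(\mathbf{x}_1,\dots,\mathbf{x}_n). \]
□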


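Theorem 4.1.21. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism. Then $f$ is invertible if and only if $\det f \neq 0$.
If $f$ is invertible, then
\[ \det f^{-1} = \frac{1}{\det f}. \]

Proof. Let $D : \mathcal{V}^n \to \mathbb{K}$ be a nonzero alternating $n$-linear form. If $f$ is not
invertible, then there is a nonzero vector $\mathbf{x}_1$ such that $f(\mathbf{x}_1) = \mathbf{0}$. If we extend $\mathbf{x}_1$ to
a basis $\{\mathbf{x}_1,\dots,\mathbf{x}_n\}$ of $\mathcal{V}$, then we have
\[ D(f(\mathbf{x}_1),\dots,f(\mathbf{x}_n)) = \det f \, D(\mathbf{x}_1,\dots,\mathbf{x}_n) = 0. \]
Since $\{\mathbf{x}_1,\dots,\mathbf{x}_n\}$ is a basis of $\mathcal{V}$, we have $D(\mathbf{x}_1,\dots,\mathbf{x}_n) \neq 0$. Consequently,
$\det f = 0$. This shows that, if $\det f \neq 0$, then $f$ is invertible.
Conversely, if $f$ is invertible, then
\[ 1 = \det(\operatorname{Id}_{\mathcal{V}}) = \det(f^{-1}f) = \det f^{-1} \det f. \]
Thus $\det f \neq 0$ and we have $\det f^{-1} = \dfrac{1}{\det f}$. □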


Corollary 4.1.22. Let $\mathcal{V}$ be a finite dimensional vector space and let $f$
and $g$ be endomorphisms on $\mathcal{V}$. If $f$ is invertible, then
\[ \det(f^{-1}gf) = \det g. \]

Proof. According to the previous two results we have
\[ \det(f^{-1}gf) = \det f^{-1} \det g \, \det f = \frac{1}{\det f} \det g \, \det f = \det g. \]
□
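Theorem 4.1.20 and Corollary 4.1.22 can be spot-checked on concrete matrices. In the sketch below endomorphisms of $\mathbb{Q}^3$ are represented by integer matrices; the helper names and the sample matrices `g`, `h`, `f`, `f_inv` are hypothetical. The determinant is computed by cofactor expansion along the first row, which is a different (but equivalent) method from the permutation sum used in the text.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(a):
    """Determinant by cofactor expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for k, entry in enumerate(a[0]):
        minor = [row[:k] + row[k + 1:] for row in a[1:]]
        total += (-1) ** k * entry * det(minor)
    return total


# Hypothetical sample endomorphisms of Q^3.
g = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]
h = [[1, 1, 0], [2, 0, 1], [0, 1, 3]]

# An invertible f together with its inverse (both have integer entries).
f = [[1, 2, 0], [0, 1, 3], [0, 0, 1]]
f_inv = [[1, -2, 6], [0, 1, -3], [0, 0, 1]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# det(gh) = det g det h
product_rule_holds = det(matmul(g, h)) == det(g) * det(h)

# det(f^{-1} g f) = det g
similarity_rule_holds = det(matmul(f_inv, matmul(g, f))) == det(g)
```

A check on a few matrices is of course no proof; it is only meant to make the two identities tangible.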




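Lemma 4.1.23. Let $\mathcal{V}_1$ and $\mathcal{V}_2$ be finite dimensional vector spaces and
let $\mathcal{V} = \mathcal{V}_1 \oplus \mathcal{V}_2$. If $f : \mathcal{V}_1 \to \mathcal{V}_1$ is an endomorphism and $g : \mathcal{V} \to \mathcal{V}$ is
the endomorphism defined by
\[ g(\mathbf{v}_1 + \mathbf{v}_2) = f(\mathbf{v}_1) + \mathbf{v}_2 \]
for all $\mathbf{v}_1 \in \mathcal{V}_1$ and $\mathbf{v}_2 \in \mathcal{V}_2$, then
\[ \det g = \det f. \]

Proof. Let $\{\mathbf{x}_1,\dots,\mathbf{x}_n\}$ be a basis of the vector space $\mathcal{V}$ such that $\{\mathbf{x}_1,\dots,\mathbf{x}_p\}$
is a basis of the vector space $\mathcal{V}_1$ and $\{\mathbf{x}_{p+1},\dots,\mathbf{x}_n\}$ is a basis of the vector space
$\mathcal{V}_2$. Let $D : \mathcal{V}^n \to \mathbb{K}$ be a nonzero alternating $n$-linear form. Then
\[ D(g(\mathbf{x}_1),\dots,g(\mathbf{x}_n)) = \det g \, D(\mathbf{x}_1,\dots,\mathbf{x}_n). \]
Now, if $D_1 : \mathcal{V}_1^p \to \mathbb{K}$ is the alternating $p$-linear form defined by
\[ D_1(\mathbf{v}_1,\dots,\mathbf{v}_p) = D(\mathbf{v}_1,\dots,\mathbf{v}_p,\mathbf{x}_{p+1},\dots,\mathbf{x}_n), \]
then
\[ D_1(f(\mathbf{x}_1),\dots,f(\mathbf{x}_p)) = \det f \, D_1(\mathbf{x}_1,\dots,\mathbf{x}_p) = \det f \, D(\mathbf{x}_1,\dots,\mathbf{x}_p,\mathbf{x}_{p+1},\dots,\mathbf{x}_n). \]
Hence $\det g = \det f$ because
\[ D(g(\mathbf{x}_1),\dots,g(\mathbf{x}_n)) = D(f(\mathbf{x}_1),\dots,f(\mathbf{x}_p),\mathbf{x}_{p+1},\dots,\mathbf{x}_n) = D_1(f(\mathbf{x}_1),\dots,f(\mathbf{x}_p)). \]
□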


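Theorem 4.1.24. Let $\mathcal{V}_1$ and $\mathcal{V}_2$ be finite dimensional vector spaces and
let $\mathcal{V} = \mathcal{V}_1 \oplus \mathcal{V}_2$. If $f_1 : \mathcal{V}_1 \to \mathcal{V}_1$ and $f_2 : \mathcal{V}_2 \to \mathcal{V}_2$ are endomorphisms
and $f : \mathcal{V} \to \mathcal{V}$ is the endomorphism defined by
\[ f(\mathbf{v}_1 + \mathbf{v}_2) = f_1(\mathbf{v}_1) + f_2(\mathbf{v}_2) \]
where $\mathbf{v}_1 \in \mathcal{V}_1$ and $\mathbf{v}_2 \in \mathcal{V}_2$, then
\[ \det f = \det f_1 \det f_2. \]

Proof. Let $g_1 : \mathcal{V} \to \mathcal{V}$ and $g_2 : \mathcal{V} \to \mathcal{V}$ be defined as
\[ g_1(\mathbf{v}_1 + \mathbf{v}_2) = f_1(\mathbf{v}_1) + \mathbf{v}_2 \quad\text{and}\quad g_2(\mathbf{v}_1 + \mathbf{v}_2) = \mathbf{v}_1 + f_2(\mathbf{v}_2) \]
for all $\mathbf{v}_1 \in \mathcal{V}_1$ and $\mathbf{v}_2 \in \mathcal{V}_2$. Then $f = g_1 g_2$ and thus $\det f = \det g_1 \det g_2$,
by Theorem 4.1.20. Now from Lemma 4.1.23 we have $\det g_1 = \det f_1$ and
$\det g_2 = \det f_2$, which gives us $\det f = \det f_1 \det f_2$. □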




4.1.2 Diagonalization 


In the remainder of this chapter it will be convenient to identify a number $\alpha \in \mathbb{K}$
with the operator $\alpha\operatorname{Id}$. This convention is quite natural since $(\alpha\operatorname{Id})\mathbf{x} = \alpha\mathbf{x}$, so
the $\alpha$ on the right hand side can be interpreted as a number or an operator.

Eigenvalues and eigenvectors were introduced in Chapter 3 in the context of 
operators on inner product spaces, but the definitions do not require the inner 
product. For convenience we recall the definitions of eigenvalues, eigenvectors 
and eigenspaces. 


Definition 4.1.25. Let $\mathcal{V}$ be a vector space and let $f : \mathcal{V} \to \mathcal{V}$ be an
endomorphism. A number $\lambda \in \mathbb{K}$ is called an eigenvalue of $f$ if the
equation
\[ f(\mathbf{x}) = \lambda\mathbf{x} \]
has a nontrivial solution, that is, a solution $\mathbf{x} \neq \mathbf{0}$.

The following theorem is useful when finding eigenvalues of an endomorphism.


Theorem 4.1.26. Let $\mathcal{V}$ be a vector space and let $f : \mathcal{V} \to \mathcal{V}$ be an
endomorphism. Then
\[ \lambda \text{ is an eigenvalue of } f \text{ if and only if } \det(f - \lambda) = 0. \]

Proof. The equivalence is a consequence of Theorem 4.1.21. Indeed, the equation
$f(\mathbf{x}) = \lambda\mathbf{x}$ has a solution $\mathbf{x} \neq \mathbf{0}$ if and only if the equation $(f - \lambda)(\mathbf{x}) = \mathbf{0}$
has a solution $\mathbf{x} \neq \mathbf{0}$, which means that the linear transformation $f - \lambda$ is not
invertible and this is equivalent to $\det(f - \lambda) = 0$, by Theorem 4.1.21. □


Definition 4.1.27. Let $\mathcal{V}$ be a vector space and let $f : \mathcal{V} \to \mathcal{V}$ be an
endomorphism. The polynomial
\[ c_f(t) = \det(f - t) \]
is called the characteristic polynomial of $f$ and the equation
\[ \det(f - t) = 0 \]
is called the characteristic equation of $f$.




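Example 4.1.28. Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be the endomorphism defined by
\[ f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 3 & 1 & 2 \\ 1 & 3 & 2 \\ 1 & 1 & 4 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix}. \]
Calculate $c_f$.

Solution. Let $D : \mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}$ be a nonzero alternating 3-linear form. Then
\[ c_f(t)\, D\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right) = D\left(\begin{bmatrix} 3-t \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 3-t \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 2 \\ 4-t \end{bmatrix}\right). \]
Now we proceed as in Example 4.1.7 and get
\[ D\left(\begin{bmatrix} 3-t \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 3-t \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 2 \\ 4-t \end{bmatrix}\right) = (2-t)(t^2 - 8t + 12)\, D\left(\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right). \]
Hence
\[ c_f(t) = (2-t)(t^2 - 8t + 12) = (2-t)^2(6-t). \]
□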
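A characteristic polynomial computed by hand can be cross-checked by machine. The sketch below does not expand an alternating form as in the text; instead it uses the Faddeev-LeVerrier recursion, a standard alternative that produces the coefficients of $\det(t - A)$, and then multiplies by $(-1)^n$ to match the convention $c_f(t) = \det(f - t)$. The function names are illustrative, and the matrix with rows $(3,1,2)$, $(1,3,2)$, $(1,1,4)$ is used as sample input.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly(a):
    """Coefficients of det(A - t*I), listed from the constant term upwards."""
    n = len(a)
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs = [Fraction(0)] * (n + 1)   # coefficients of det(t*I - A)
    coeffs[n] = Fraction(1)
    m = identity
    for k in range(1, n + 1):
        am = matmul(a, m)
        trace = sum(am[i][i] for i in range(n))
        coeffs[n - k] = -trace / k
        m = [[am[i][j] + coeffs[n - k] * identity[i][j] for j in range(n)]
             for i in range(n)]
    sign = (-1) ** n                   # det(A - t*I) = (-1)^n det(t*I - A)
    return [sign * c for c in coeffs]


def poly_mul(p, q):
    """Product of two polynomials given by coefficient lists (constant term first)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            result[i + j] += pi * qj
    return result


A = [[3, 1, 2], [1, 3, 2], [1, 1, 4]]
computed = char_poly(A)

# (2 - t)^2 (6 - t), expanded from its factors.
factored = poly_mul(poly_mul([2, -1], [2, -1]), [6, -1])
```

Both lists describe the polynomial $24 - 28t + 10t^2 - t^3$, written with the constant term first.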


Example 4.1.29. Let $f : \mathbb{C}^2 \to \mathbb{C}^2$ be the endomorphism defined by


(LoD) = [8 3] bd 


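Calculate $c_f$ and the eigenvalues of $f$.

Proof. Let $D : \mathbb{C}^2 \times \mathbb{C}^2 \to \mathbb{C}$ be a nonzero alternating bilinear form. Proceeding
as in Example 4.1.5 we get
\[ D\left(f\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) - t\begin{bmatrix} 1 \\ 0 \end{bmatrix}, f\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) - t\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) = \big(t^2 - (1+5i)t + i(1+4i)\big)\, D\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right). \]
Hence
\[ c_f(t) = t^2 - (1+5i)t + i(1+4i) \]
and the eigenvalues are $i$ and $1 + 4i$. □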


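Definition 4.1.30. Let $\mathcal{V}$ be a vector space and let $\lambda$ be an eigenvalue
of an endomorphism $f : \mathcal{V} \to \mathcal{V}$. A vector $\mathbf{x} \neq \mathbf{0}$ is called an eigenvector
of $f$ corresponding to the eigenvalue $\lambda$ if $f(\mathbf{x}) = \lambda\mathbf{x}$. The set
\[ \mathcal{E}_\lambda = \{\mathbf{x} \in \mathcal{V} : f(\mathbf{x}) = \lambda\mathbf{x}\} \]
is called the eigenspace of $f$ corresponding to $\lambda$.

It is easy to verify that $\mathcal{E}_\lambda$ is a subspace of $\mathcal{V}$. It consists of all eigenvectors
of $f$ corresponding to $\lambda$ and the zero vector. Note that $\mathcal{E}_0 = \ker f$.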


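Theorem 4.1.31. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism. If $\{\mathbf{v}_1,\dots,\mathbf{v}_n\}$ is a basis of $\mathcal{V}$ such that
\begin{align*}
f(\mathbf{v}_1) &= x_{11}\mathbf{v}_1 \\
f(\mathbf{v}_2) &= x_{12}\mathbf{v}_1 + x_{22}\mathbf{v}_2 \\
&\;\;\vdots \\
f(\mathbf{v}_n) &= x_{1n}\mathbf{v}_1 + x_{2n}\mathbf{v}_2 + \dots + x_{n-1,n}\mathbf{v}_{n-1} + x_{nn}\mathbf{v}_n,
\end{align*}
where $x_{jk} \in \mathbb{K}$ for all $j,k \in \{1,\dots,n\}$ such that $j \leq k$, then
\[ c_f(t) = (x_{11} - t)\cdots(x_{nn} - t). \]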




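Proof. For any nonzero alternating $n$-linear form $D : \mathcal{V}^n \to \mathbb{K}$ we have
\begin{align*}
c_f(t) D(\mathbf{v}_1,\dots,\mathbf{v}_n) &= \det(f - t) D(\mathbf{v}_1,\dots,\mathbf{v}_n) \\
&= D(f(\mathbf{v}_1) - t\mathbf{v}_1, f(\mathbf{v}_2) - t\mathbf{v}_2, \dots, f(\mathbf{v}_n) - t\mathbf{v}_n) \\
&= D(x_{11}\mathbf{v}_1 - t\mathbf{v}_1, x_{12}\mathbf{v}_1 + x_{22}\mathbf{v}_2 - t\mathbf{v}_2, \dots, \\
&\qquad x_{1n}\mathbf{v}_1 + x_{2n}\mathbf{v}_2 + \dots + x_{n-1,n}\mathbf{v}_{n-1} + x_{nn}\mathbf{v}_n - t\mathbf{v}_n) \\
&= D(x_{11}\mathbf{v}_1 - t\mathbf{v}_1, x_{22}\mathbf{v}_2 - t\mathbf{v}_2, \dots, x_{nn}\mathbf{v}_n - t\mathbf{v}_n) \\
&= (x_{11} - t)\cdots(x_{nn} - t) D(\mathbf{v}_1,\dots,\mathbf{v}_n).
\end{align*}
□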


It turns out that the converse of the above result is also true. 


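Theorem 4.1.32. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism such that
\[ c_f(t) = (\lambda_1 - t)\cdots(\lambda_n - t) \]
for some $\lambda_1,\dots,\lambda_n \in \mathbb{K}$. Then there is a basis $\{\mathbf{v}_1,\dots,\mathbf{v}_n\}$ of $\mathcal{V}$ such
that
\begin{align*}
f(\mathbf{v}_1) &= x_{11}\mathbf{v}_1 \\
f(\mathbf{v}_2) &= x_{12}\mathbf{v}_1 + x_{22}\mathbf{v}_2 \\
&\;\;\vdots \\
f(\mathbf{v}_n) &= x_{1n}\mathbf{v}_1 + x_{2n}\mathbf{v}_2 + \dots + x_{n-1,n}\mathbf{v}_{n-1} + x_{nn}\mathbf{v}_n,
\end{align*}
where $x_{jk} \in \mathbb{K}$ for all $j,k \in \{1,\dots,n\}$ such that $j \leq k$, and
\[ x_{11} = \lambda_1, \; x_{22} = \lambda_2, \; \dots, \; x_{nn} = \lambda_n. \]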


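Proof. We are going to use induction on $n$. Clearly, the theorem holds when
$n = 1$. Now let $n \geq 2$ and assume that the theorem holds for $n - 1$.

If $c_f(t) = (\lambda_1 - t)\cdots(\lambda_n - t)$, then $\lambda_1,\dots,\lambda_n$ are eigenvalues of $f$. Let $\mathbf{v}_1$
be an eigenvector of $f$ corresponding to the eigenvalue $\lambda_1$, that is, $f(\mathbf{v}_1) = \lambda_1\mathbf{v}_1$
and $\mathbf{v}_1 \neq \mathbf{0}$. We define $\mathcal{V}_1 = \operatorname{Span}\{\mathbf{v}_1\}$. Let $\mathcal{W}$ be a vector subspace of $\mathcal{V}$ such
that $\mathcal{V} = \mathcal{V}_1 \oplus \mathcal{W}$ and let $p$ be the projection of $\mathcal{V}$ on $\mathcal{W}$ along $\mathcal{V}_1$. We denote
by $g : \mathcal{W} \to \mathcal{W}$ the endomorphism induced by $pf$ on $\mathcal{W}$.

Let $\{\mathbf{w}_2,\dots,\mathbf{w}_n\}$ be a basis of $\mathcal{W}$ and let $D : \mathcal{V}^n \to \mathbb{K}$ be a nonzero alternating
$n$-linear form. For some $y_{1k} \in \mathbb{K}$, where $k \in \{2,\dots,n\}$, we have
\begin{align*}
f(\mathbf{w}_2) &= y_{12}\mathbf{v}_1 + g(\mathbf{w}_2) \\
&\;\;\vdots \\
f(\mathbf{w}_n) &= y_{1n}\mathbf{v}_1 + g(\mathbf{w}_n).
\end{align*}
If $c_f$ is the characteristic polynomial of $f$ and $c_g$ is the characteristic polynomial
of $g$, then, with $y_{1k}$ as above, we have
\begin{align*}
c_f(t) D(\mathbf{v}_1, \mathbf{w}_2, \dots, \mathbf{w}_n) &= D(f(\mathbf{v}_1) - t\mathbf{v}_1, f(\mathbf{w}_2) - t\mathbf{w}_2, \dots, f(\mathbf{w}_n) - t\mathbf{w}_n) \\
&= (\lambda_1 - t) D(\mathbf{v}_1, y_{12}\mathbf{v}_1 + g(\mathbf{w}_2) - t\mathbf{w}_2, \dots, y_{1n}\mathbf{v}_1 + g(\mathbf{w}_n) - t\mathbf{w}_n) \\
&= (\lambda_1 - t) D(\mathbf{v}_1, g(\mathbf{w}_2) - t\mathbf{w}_2, \dots, g(\mathbf{w}_n) - t\mathbf{w}_n) \\
&= (\lambda_1 - t) c_g(t) D(\mathbf{v}_1, \mathbf{w}_2, \dots, \mathbf{w}_n).
\end{align*}
Consequently, $c_f(t) = (\lambda_1 - t) c_g(t)$ and thus $c_g(t) = (\lambda_2 - t)\cdots(\lambda_n - t)$. By our
inductive assumption the theorem holds for the endomorphism $g : \mathcal{W} \to \mathcal{W}$ and
thus there is a basis $\{\mathbf{v}_2,\dots,\mathbf{v}_n\}$ of $\mathcal{W}$ and $x_{jk} \in \mathbb{K}$ for $j,k \in \{2,\dots,n\}$, $j \leq k$,
such that
\begin{align*}
g(\mathbf{v}_2) &= x_{22}\mathbf{v}_2 \\
g(\mathbf{v}_3) &= x_{23}\mathbf{v}_2 + x_{33}\mathbf{v}_3 \\
&\;\;\vdots \\
g(\mathbf{v}_n) &= x_{2n}\mathbf{v}_2 + \dots + x_{nn}\mathbf{v}_n,
\end{align*}
where $x_{22} = \lambda_2, \dots, x_{nn} = \lambda_n$. Consequently, there are $x_{12},\dots,x_{1n} \in \mathbb{K}$ such
that
\begin{align*}
f(\mathbf{v}_1) &= x_{11}\mathbf{v}_1 \\
f(\mathbf{v}_2) &= x_{12}\mathbf{v}_1 + g(\mathbf{v}_2) = x_{12}\mathbf{v}_1 + x_{22}\mathbf{v}_2 \\
&\;\;\vdots \\
f(\mathbf{v}_n) &= x_{1n}\mathbf{v}_1 + g(\mathbf{v}_n) = x_{1n}\mathbf{v}_1 + x_{2n}\mathbf{v}_2 + \dots + x_{n-1,n}\mathbf{v}_{n-1} + x_{nn}\mathbf{v}_n,
\end{align*}
where $x_{11} = \lambda_1, x_{22} = \lambda_2, \dots, x_{nn} = \lambda_n$. □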


Definition 4.1.33. Let V be a vector space and let f : V — V be an 


endomorphism. A polynomial p is called an f-annihilator if p(f) = 0. 


Note that in Chapter 3 we define the annihilator of a subset in an inner 
product space. In that context the annihilator is a subspace. The f-annihilator 
of an endomorphism f is a polynomial. 

Every endomorphism $f$ on a finite dimensional vector space has $f$-annihilators.
Indeed, if $\mathcal{V}$ is an $n$-dimensional vector space, then the dimension of the vector
space $\mathcal{L}(\mathcal{V})$ of all endomorphisms $f : \mathcal{V} \to \mathcal{V}$ is $n^2$. Consequently, if $f \in \mathcal{L}(\mathcal{V})$,
then the endomorphisms $\operatorname{Id}, f, f^2, \dots, f^{n^2}$ are linearly dependent and thus there
are numbers $x_0, x_1, \dots, x_{n^2} \in \mathbb{K}$, not all equal to 0, such that
\[ x_0 \operatorname{Id} + x_1 f + x_2 f^2 + \dots + x_{n^2} f^{n^2} = 0. \]




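Example 4.1.34. Let $\mathcal{V}$ be a vector space and let $f : \mathcal{V} \to \mathcal{V}$ be an endomorphism.
We suppose that $\dim \mathcal{V} = 4$ and that $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ is a basis of
$\mathcal{V}$. Let
\[ \begin{bmatrix} \alpha & 1 & 0 & 0 \\ 0 & \alpha & 1 & 0 \\ 0 & 0 & \alpha & 1 \\ 0 & 0 & 0 & \alpha \end{bmatrix} \]
be the $\mathcal{B}$-matrix of $f$ for some $\alpha \in \mathbb{K}$. Show that $(t - \alpha)^4$ is an annihilator of $f$.

Proof. Since
\[ f(\mathbf{v}_1) = \alpha\mathbf{v}_1, \quad f(\mathbf{v}_2) = \mathbf{v}_1 + \alpha\mathbf{v}_2, \quad f(\mathbf{v}_3) = \mathbf{v}_2 + \alpha\mathbf{v}_3, \quad f(\mathbf{v}_4) = \mathbf{v}_3 + \alpha\mathbf{v}_4, \]
which can be written as
\[ (f - \alpha)(\mathbf{v}_1) = \mathbf{0}, \quad (f - \alpha)(\mathbf{v}_2) = \mathbf{v}_1, \quad (f - \alpha)(\mathbf{v}_3) = \mathbf{v}_2, \quad (f - \alpha)(\mathbf{v}_4) = \mathbf{v}_3, \]
we have
\[ (f - \alpha)^2(\mathbf{v}_1) = (f - \alpha)^2(\mathbf{v}_2) = \mathbf{0}, \quad (f - \alpha)^2(\mathbf{v}_3) = (f - \alpha)(\mathbf{v}_2), \quad (f - \alpha)^2(\mathbf{v}_4) = (f - \alpha)(\mathbf{v}_3), \]
\[ (f - \alpha)^3(\mathbf{v}_1) = (f - \alpha)^3(\mathbf{v}_2) = (f - \alpha)^3(\mathbf{v}_3) = \mathbf{0}, \quad (f - \alpha)^3(\mathbf{v}_4) = (f - \alpha)^2(\mathbf{v}_3), \]
and finally
\[ (f - \alpha)^4(\mathbf{v}_1) = (f - \alpha)^4(\mathbf{v}_2) = (f - \alpha)^4(\mathbf{v}_3) = (f - \alpha)^4(\mathbf{v}_4) = \mathbf{0}. \]
Consequently the polynomial $(t - \alpha)^4$ is an $f$-annihilator. □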
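The computation in Example 4.1.34 amounts to taking powers of the matrix of $f - \alpha$. A small sketch, with the arbitrary sample value $\alpha = 5$ and hypothetical helper names, shows that the fourth power vanishes while the third does not:

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power(a, k):
    """k-th power of a square matrix (k >= 0), by repeated multiplication."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result


alpha = 5  # sample value; any number works

# Matrix with alpha on the diagonal and 1 directly above the diagonal.
A = [[alpha if i == j else (1 if j == i + 1 else 0) for j in range(4)]
     for i in range(4)]

# Matrix of f - alpha.
N = [[A[i][j] - (alpha if i == j else 0) for j in range(4)] for i in range(4)]

powers = {k: power(N, k) for k in range(1, 5)}
```

Here `powers[3]` has a single nonzero entry, in the top right corner, which mirrors the chain $(f - \alpha)^3(\mathbf{v}_4) = \mathbf{v}_1$, and `powers[4]` is the zero matrix.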


It is easy to verify that for the endomorphism $f$ from the above example we
have $c_f(t) = (t - \alpha)^4$. It is not a coincidence that $(f - \alpha)^4 = 0$. Actually,
$c_f(f) = 0$ whenever $c_f$ splits over $\mathbb{K}$, as stated in the next theorem.

If a polynomial $p$ can be written as $p(t) = (\lambda_1 - t)\cdots(\lambda_n - t)$ for some
$\lambda_1,\dots,\lambda_n \in \mathbb{K}$, then we say that $p$ splits over $\mathbb{K}$. Note that every polynomial
with complex coefficients splits over $\mathbb{C}$, by the Fundamental Theorem of Algebra,
but not every polynomial with real coefficients splits over $\mathbb{R}$.


Theorem 4.1.35 (Cayley-Hamilton). Let $\mathcal{V}$ be an $n$-dimensional vector
space. If $f : \mathcal{V} \to \mathcal{V}$ is an endomorphism such that its characteristic
polynomial $c_f$ splits over $\mathbb{K}$, then
\[ c_f(f) = 0. \]




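Proof. By Theorem 4.1.32 there is a basis $\{\mathbf{v}_1,\dots,\mathbf{v}_n\}$ of $\mathcal{V}$ such that
\begin{align*}
f(\mathbf{v}_1) &= x_{11}\mathbf{v}_1 \\
f(\mathbf{v}_2) &= x_{12}\mathbf{v}_1 + x_{22}\mathbf{v}_2 \\
&\;\;\vdots \\
f(\mathbf{v}_n) &= x_{1n}\mathbf{v}_1 + x_{2n}\mathbf{v}_2 + \dots + x_{n-1,n}\mathbf{v}_{n-1} + x_{nn}\mathbf{v}_n,
\end{align*}
where $x_{jk} \in \mathbb{K}$ for all $j,k \in \{1,\dots,n\}$ such that $j \leq k$. Then $c_f(t) = (x_{11} - t)\cdots(x_{nn} - t)$,
by Theorem 4.1.31. We need to show that $c_f(f) = (x_{11} - f)\cdots(x_{nn} - f) = 0$.
We will show by induction that
\[ (x_{11} - f)\cdots(x_{jj} - f)(\mathbf{x}) = \mathbf{0} \]
for every $\mathbf{x} \in \operatorname{Span}\{\mathbf{v}_1,\dots,\mathbf{v}_j\}$.
Clearly,
\[ (x_{11} - f)(\mathbf{x}) = \mathbf{0} \]
for every vector $\mathbf{x} \in \operatorname{Span}\{\mathbf{v}_1\}$. Now suppose that for some $j \in \{1,\dots,n-1\}$
we have
\[ (x_{11} - f)\cdots(x_{jj} - f)(\mathbf{x}) = \mathbf{0} \]
for every $\mathbf{x} \in \operatorname{Span}\{\mathbf{v}_1,\dots,\mathbf{v}_j\}$. Since
\begin{align*}
(x_{j+1,j+1} - f)(\mathbf{v}_{j+1}) &= x_{j+1,j+1}\mathbf{v}_{j+1} - x_{1,j+1}\mathbf{v}_1 - x_{2,j+1}\mathbf{v}_2 - \dots - x_{j,j+1}\mathbf{v}_j - x_{j+1,j+1}\mathbf{v}_{j+1} \\
&= -x_{1,j+1}\mathbf{v}_1 - x_{2,j+1}\mathbf{v}_2 - \dots - x_{j,j+1}\mathbf{v}_j,
\end{align*}
we have
\begin{align*}
(x_{11} - f)\cdots(x_{jj} - f)(x_{j+1,j+1} - f)(\mathbf{v}_{j+1})
&= (x_{11} - f)\cdots(x_{jj} - f)(-x_{1,j+1}\mathbf{v}_1 - x_{2,j+1}\mathbf{v}_2 - \dots - x_{j,j+1}\mathbf{v}_j) \\
&= \mathbf{0},
\end{align*}
by the inductive assumption. Since the factors $x_{kk} - f$ commute, the same product also
vanishes on $\operatorname{Span}\{\mathbf{v}_1,\dots,\mathbf{v}_j\}$ and hence on $\operatorname{Span}\{\mathbf{v}_1,\dots,\mathbf{v}_{j+1}\}$.
Consequently, $c_f(f)(\mathbf{x}) = (x_{11} - f)\cdots(x_{nn} - f)(\mathbf{x}) = \mathbf{0}$ for every $\mathbf{x} \in \mathcal{V}$. □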
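The statement $c_f(f) = 0$ can be checked on concrete matrices by multiplying out the factors $\lambda_j - f$. The sketch below uses two sample matrices chosen for illustration: the $3 \times 3$ matrix with rows $(3,1,2)$, $(1,3,2)$, $(1,1,4)$, whose characteristic polynomial is $(2-t)^2(6-t)$, and the $2 \times 2$ matrix with rows $(4,1)$, $(2,3)$, whose characteristic polynomial is $(2-t)(5-t)$. The helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scalar_minus(c, a):
    """Matrix of the operator c - f, where `a` is the matrix of f."""
    n = len(a)
    return [[(c if i == j else 0) - a[i][j] for j in range(n)] for i in range(n)]


def evaluate_split_polynomial(roots, a):
    """Matrix of (r_1 - f)(r_2 - f)...(r_m - f) for the listed roots (with repetition)."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for r in roots:
        result = matmul(result, scalar_minus(r, a))
    return result


A3 = [[3, 1, 2], [1, 3, 2], [1, 1, 4]]
A2 = [[4, 1], [2, 3]]

c_of_A3 = evaluate_split_polynomial([2, 2, 6], A3)   # (2 - f)^2 (6 - f)
c_of_A2 = evaluate_split_polynomial([2, 5], A2)      # (2 - f)(5 - f)
```

Both products are zero matrices, as the Cayley-Hamilton theorem predicts for these samples.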


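Example 4.1.36. Verify the Cayley-Hamilton theorem for the endomorphism $f$
in Example 4.1.34.

Solution. Let $D : \mathcal{V}^4 \to \mathbb{K}$ be a nonzero alternating multilinear form. Then
\begin{align*}
& D(f(\mathbf{v}_1) - t\mathbf{v}_1, f(\mathbf{v}_2) - t\mathbf{v}_2, f(\mathbf{v}_3) - t\mathbf{v}_3, f(\mathbf{v}_4) - t\mathbf{v}_4) \\
&\quad = D(\alpha\mathbf{v}_1 - t\mathbf{v}_1, \mathbf{v}_1 + \alpha\mathbf{v}_2 - t\mathbf{v}_2, \mathbf{v}_2 + \alpha\mathbf{v}_3 - t\mathbf{v}_3, \mathbf{v}_3 + \alpha\mathbf{v}_4 - t\mathbf{v}_4) \\
&\quad = D(\alpha\mathbf{v}_1 - t\mathbf{v}_1, \alpha\mathbf{v}_2 - t\mathbf{v}_2, \alpha\mathbf{v}_3 - t\mathbf{v}_3, \alpha\mathbf{v}_4 - t\mathbf{v}_4) \\
&\quad = (\alpha - t)(\alpha - t)(\alpha - t)(\alpha - t) D(\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4) \\
&\quad = (\alpha - t)^4 D(\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4).
\end{align*}
This shows that $c_f(t) = (\alpha - t)^4$. Thus $c_f(f) = 0$, by Example 4.1.34. □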


Example 4.1.37. Verify Cayley-Hamilton Theorem for the endomorphism 


1) IE 


Solution. Let $D : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ be a nonzero alternating bilinear form. Proceeding
as in Example 4.1.5 we obtain


> ([a'] [s*d) e+ (lo) ET) 


Hence
\[ c_f(t) = t^2 - 7t + 11 \]
and we have


Ba ae enti alktls calli al=lp ab 


Example 4.1.38. Let $f : \mathcal{V} \to \mathcal{V}$ be an endomorphism. We suppose that $t^3 + t$
is an $f$-annihilator. Show that, if $\lambda$ is an eigenvalue of $f$, then $\lambda \in \{0, -i, i\}$.

Solution. Let $\mathbf{x}$ be an eigenvector corresponding to an eigenvalue $\lambda$ of $f$. Then
\[ (f^3 + f)(\mathbf{x}) = (\lambda^3 + \lambda)\mathbf{x} = \mathbf{0}. \]
Since $\mathbf{x} \neq \mathbf{0}$, we have $\lambda^3 + \lambda = 0$, which gives us the desired result. □


Theorem 4.1.39. Let $\mathcal{V}$ be a finite dimensional vector space and let
$f : \mathcal{V} \to \mathcal{V}$ be a nonzero endomorphism.

(a) There is a unique monic polynomial $m_f$ of smallest positive degree
such that $m_f(f) = 0$;

(b) $m_f$ divides every $f$-annihilator;

(c) If $\lambda$ is an eigenvalue of $f$ then $m_f(\lambda) = 0$.




Proof. Note that $c_f$ is an $f$-annihilator. Recall that a monic polynomial is a
single-variable polynomial in which the leading coefficient is equal to 1.

Let $p$ be a monic polynomial of smallest positive degree such that $p(f) = 0$
and let $q$ be a polynomial of positive degree such that $q(f) = 0$. Then $q = pa + r$,
where $a$ and $r$ are polynomials and $r = 0$ or the degree of $r$ is strictly less than
the degree of $p$. Since
\[ q(f) = p(f)a(f) + r(f), \]
we have $r(f) = 0$. Now, because $p$ is a polynomial of smallest positive degree
such that $p(f) = 0$, we have $r = 0$ and thus $p$ divides $q$.

If $q$ is another monic polynomial of smallest positive degree such that $q(f) = 0$,
then the equality $q = pa$ implies $a = 1$ and consequently $p = q$.

Finally, if $\mathbf{v}$ is an eigenvector corresponding to the eigenvalue $\lambda$, then
\[ m_f(f)(\mathbf{v}) = m_f(\lambda)\mathbf{v}, \]
which gives us $m_f(\lambda) = 0$ because $\mathbf{v} \neq \mathbf{0}$. □


Definition 4.1.40. Let $\mathcal{V}$ be a finite dimensional vector space and let
$f : \mathcal{V} \to \mathcal{V}$ be a nonzero endomorphism. The unique monic polynomial
$m_f$ of smallest positive degree such that $m_f(f) = 0$ is called the minimal
polynomial of $f$.
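For a concrete matrix the minimal polynomial can be found by looking for the first power of the matrix that is a linear combination of the lower powers, which is the linear dependence argument used above to show that annihilators exist. The following sketch implements this idea with exact rational arithmetic; the function names and the sample matrices are assumptions made for illustration, and the method is only intended for small matrices.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def flatten(a):
    """Entries of a matrix as one long list."""
    return [v for row in a for v in row]


def solve_exact(columns, target):
    """Return x with sum_i x[i] * columns[i] == target, or None if there is no solution."""
    rows, d = len(target), len(columns)
    aug = [[Fraction(columns[j][i]) for j in range(d)] + [Fraction(target[i])]
           for i in range(rows)]
    pivots, r = [], 0
    for c in range(d):
        pivot = next((i for i in range(r, rows) if aug[i][c] != 0), None)
        if pivot is None:
            continue
        aug[r], aug[pivot] = aug[pivot], aug[r]
        aug[r] = [v / aug[r][c] for v in aug[r]]
        for i in range(rows):
            if i != r and aug[i][c] != 0:
                factor = aug[i][c]
                aug[i] = [vi - factor * vr for vi, vr in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    if any(row[d] != 0 for row in aug[r:]):
        return None
    x = [Fraction(0)] * d
    for row_index, c in enumerate(pivots):
        x[c] = aug[row_index][d]
    return x


def minimal_polynomial(a):
    """Coefficients (constant term first) of the monic polynomial of least degree with m(A) = 0."""
    n = len(a)
    powers = [[[int(i == j) for j in range(n)] for i in range(n)]]
    for d in range(1, n * n + 1):
        powers.append(matmul(powers[-1], a))
        # Look for c_0, ..., c_{d-1} with A^d + c_{d-1} A^{d-1} + ... + c_0 Id = 0.
        x = solve_exact([flatten(p) for p in powers[:d]],
                        [-v for v in flatten(powers[d])])
        if x is not None:
            return x + [Fraction(1)]
    raise ArithmeticError("no annihilator found")  # cannot happen for a square matrix


m_diagonalizable = minimal_polynomial([[3, 1, 2], [1, 3, 2], [1, 1, 4]])
m_block = minimal_polynomial([[3, 1], [0, 3]])
```

For the first sample the result encodes $t^2 - 8t + 12 = (t-2)(t-6)$, and for the second it encodes $t^2 - 6t + 9 = (t-3)^2$.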


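Example 4.1.41. Let $\mathcal{V}$ be a vector space such that $\dim \mathcal{V} = 4$ and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism. If $\mathcal{B} = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\}$ is a basis of $\mathcal{V}$ and
\[ \begin{bmatrix} \alpha & 1 & 0 & 0 \\ 0 & \alpha & 1 & 0 \\ 0 & 0 & \alpha & 1 \\ 0 & 0 & 0 & \alpha \end{bmatrix} \]
is the $\mathcal{B}$-matrix of $f$ for some $\alpha \in \mathbb{K}$, find $m_f$.

Solution. From Example 4.1.34 we know that the polynomial $(t - \alpha)^4$ is an
annihilator of $f$. Since $m_f$ divides $(t - \alpha)^4$, it has the form $(t - \alpha)^k$ with $k \leq 4$.
We note that $(t - \alpha)^4$ is the minimal polynomial $m_f$ because
\[ (f - \alpha)^3(\mathbf{v}_4) = (f - \alpha)^2(\mathbf{v}_3) = (f - \alpha)(\mathbf{v}_2) = \mathbf{v}_1 \neq \mathbf{0}. \]
□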




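Theorem 4.1.42. Let $\mathcal{V}_1$ and $\mathcal{V}_2$ be finite dimensional vector spaces
and let $\mathcal{V} = \mathcal{V}_1 \oplus \mathcal{V}_2$. If $f_1 : \mathcal{V}_1 \to \mathcal{V}_1$ and $f_2 : \mathcal{V}_2 \to \mathcal{V}_2$ are nonzero
endomorphisms and $f : \mathcal{V} \to \mathcal{V}$ is the endomorphism defined by
\[ f(\mathbf{v}_1 + \mathbf{v}_2) = f_1(\mathbf{v}_1) + f_2(\mathbf{v}_2) \]
where $\mathbf{v}_1 \in \mathcal{V}_1$ and $\mathbf{v}_2 \in \mathcal{V}_2$, then
\[ m_f = \operatorname{LCM}(m_{f_1}, m_{f_2}). \]

Proof. Let $p = \operatorname{LCM}(m_{f_1}, m_{f_2})$ and let $\mathbf{v}_1 \in \mathcal{V}_1$ and $\mathbf{v}_2 \in \mathcal{V}_2$. Then
\[ p(f)(\mathbf{v}_1 + \mathbf{v}_2) = p(f)(\mathbf{v}_1) + p(f)(\mathbf{v}_2) = p(f_1)(\mathbf{v}_1) + p(f_2)(\mathbf{v}_2) = \mathbf{0}, \]
because both $m_{f_1}$ and $m_{f_2}$ divide $p$. This implies that $m_f$ divides $p$.
Now, since $m_f(f)(\mathbf{v}) = m_f(f_1)(\mathbf{v}) = \mathbf{0}$ for every $\mathbf{v} \in \mathcal{V}_1$, $m_{f_1}$ divides $m_f$.
Similarly, $m_{f_2}$ divides $m_f$. Consequently $p$ divides $m_f$. □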


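Theorem 4.1.43. Let $\mathcal{V}_1$ and $\mathcal{V}_2$ be finite dimensional vector spaces and
let $\mathcal{V} = \mathcal{V}_1 \oplus \mathcal{V}_2$. If $f_1 : \mathcal{V}_1 \to \mathcal{V}_1$ and $f_2 : \mathcal{V}_2 \to \mathcal{V}_2$ are endomorphisms
and $f : \mathcal{V} \to \mathcal{V}$ is the endomorphism defined by
\[ f(\mathbf{v}_1 + \mathbf{v}_2) = f_1(\mathbf{v}_1) + f_2(\mathbf{v}_2) \]
where $\mathbf{v}_1 \in \mathcal{V}_1$ and $\mathbf{v}_2 \in \mathcal{V}_2$, then
\[ c_f = c_{f_1} c_{f_2}. \]

Proof. This result is a consequence of Theorem 4.1.24. □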


The following result is of significant importance for the remainder of this 
chapter. 


Theorem 4.1.44. Let $\mathcal{V}$ be a finite dimensional vector space and let
$f : \mathcal{V} \to \mathcal{V}$ be an endomorphism. If $p_1,\dots,p_k$ are polynomials such that
$\operatorname{GCD}(p_j, p_l) = 1$ for every $j,l \in \{1,\dots,k\}$ such that $j \neq l$ and the
product $p_1 \cdots p_k$ is an $f$-annihilator, then
\[ \mathcal{V} = \ker p_1(f) \oplus \dots \oplus \ker p_k(f). \]


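Proof. Let $q_j = p_1 \cdots p_{j-1} p_{j+1} \cdots p_k$ for $j \in \{1,\dots,k\}$. Clearly $\operatorname{GCD}(q_1,\dots,q_k) = 1$
and thus there are polynomials $a_1,\dots,a_k$ such that
\[ a_1 q_1 + \dots + a_k q_k = 1. \]
Consequently,
\[ a_1(f)q_1(f) + \dots + a_k(f)q_k(f) = \operatorname{Id} \]
and thus
\[ a_1(f)q_1(f)(\mathbf{v}) + \dots + a_k(f)q_k(f)(\mathbf{v}) = \mathbf{v} \tag{4.2} \]
for every $\mathbf{v} \in \mathcal{V}$. Note that for every $j \in \{1,\dots,k\}$ we have
\[ p_j(f)a_j(f)q_j(f)(\mathbf{v}) = a_j(f)p_j(f)q_j(f)(\mathbf{v}) = a_j(f)p_1(f)\cdots p_k(f)(\mathbf{v}) = \mathbf{0}, \]
and thus $a_j(f)q_j(f)(\mathbf{v}) \in \ker p_j(f)$. Hence, by (4.2), we have
\[ \mathcal{V} = \ker p_1(f) + \dots + \ker p_k(f). \]
We need to show that the sum is direct.
Let $\mathbf{v}_j \in \ker p_j(f)$ for $j \in \{1,\dots,k\}$. From (4.2) we get
\[ a_1(f)q_1(f)(\mathbf{v}_j) + \dots + a_k(f)q_k(f)(\mathbf{v}_j) = \mathbf{v}_j. \]
Since $a_l(f)q_l(f)(\mathbf{v}_j) = \mathbf{0}$ for $l \in \{1,\dots,j-1,j+1,\dots,k\}$, we have
\[ a_1(f)q_1(f)(\mathbf{v}_j) + \dots + a_k(f)q_k(f)(\mathbf{v}_j) = a_j(f)q_j(f)(\mathbf{v}_j) \]
and thus
\[ \mathbf{v}_j = a_j(f)q_j(f)(\mathbf{v}_j). \tag{4.3} \]
Suppose now that the vectors $\mathbf{v}_j \in \ker p_j(f)$, $j \in \{1,\dots,k\}$, are such that
\[ \mathbf{v}_1 + \dots + \mathbf{v}_k = \mathbf{0}. \]
Then for every $j \in \{1,\dots,k\}$ we have
\[ \mathbf{0} = a_j(f)q_j(f)(\mathbf{0}) = a_j(f)q_j(f)(\mathbf{v}_1 + \dots + \mathbf{v}_k) = a_j(f)q_j(f)(\mathbf{v}_j) \]
because $a_j(f)q_j(f)(\mathbf{v}_l) = \mathbf{0}$ for $l \in \{1,\dots,j-1,j+1,\dots,k\}$. Hence, by (4.3),
we have $\mathbf{v}_j = a_j(f)q_j(f)(\mathbf{v}_j) = \mathbf{0}$, which shows that the sum $\ker p_1(f) + \dots + \ker p_k(f)$
is direct. □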
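The operators $a_j(f)q_j(f)$ appearing in the proof of Theorem 4.1.44 act as projections onto the summands $\ker p_j(f)$. A small numerical sketch may help; everything specific in it is an assumption chosen for illustration. It takes the matrix with rows $(4,1)$, $(2,3)$, for which $(t-2)(t-5)$ is an annihilator, sets $p_1 = t - 2$ and $p_2 = t - 5$, so that $q_1 = t - 5$ and $q_2 = t - 2$, and uses $a_1 = -\tfrac{1}{3}$, $a_2 = \tfrac{1}{3}$, which satisfy $a_1 q_1 + a_2 q_2 = 1$.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shifted(a, lam):
    """Matrix of f - lam, where `a` is the matrix of f."""
    n = len(a)
    return [[a[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]


def scale(c, a):
    """Scalar multiple c * a of a matrix."""
    return [[c * v for v in row] for row in a]


def add(a, b):
    """Sum of two matrices of the same size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


A = [[4, 1], [2, 3]]

# a_1(f) q_1(f) = -(1/3)(f - 5)  and  a_2(f) q_2(f) = (1/3)(f - 2)
P1 = scale(Fraction(-1, 3), shifted(A, 5))
P2 = scale(Fraction(1, 3), shifted(A, 2))

sum_is_identity = add(P1, P2) == [[1, 0], [0, 1]]                 # equation (4.2)
image_of_P1_in_kernel = matmul(shifted(A, 2), P1) == [[0, 0], [0, 0]]
image_of_P2_in_kernel = matmul(shifted(A, 5), P2) == [[0, 0], [0, 0]]
products_vanish = matmul(P1, P2) == [[0, 0], [0, 0]]
```

The four flags correspond to the steps of the proof: the two operators add up to the identity, each maps into the kernel of the corresponding $p_j(f)$, and each kills the other summand.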


Invariance of a subspace with respect to a linear operator was initially introduced
in Chapter 2 in exercises and then repeated in Chapter 3. For convenience
we recall that a subspace $\mathcal{U}$ of a vector space $\mathcal{V}$ is called $f$-invariant,
where $f : \mathcal{V} \to \mathcal{V}$ is an endomorphism, if $f(\mathcal{U}) \subseteq \mathcal{U}$. Invariance of subspaces will
play an important role in this chapter.

It is easy to see that, if $\mathcal{V}$ is a vector space and $f : \mathcal{V} \to \mathcal{V}$ is an endomorphism,
then $\ker f$ and $\operatorname{ran} f$ are $f$-invariant subspaces. More generally, if $p$ is a
polynomial, then $\ker p(f)$ is an $f$-invariant subspace.




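Theorem 4.1.45. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism. If
\[ c_f(t) = (\lambda_1 - t)^{r_1} \cdots (\lambda_k - t)^{r_k} \]
for some distinct $\lambda_1,\dots,\lambda_k \in \mathbb{K}$ and some positive integers $r_1,\dots,r_k$,
then
\[ \mathcal{V} = \ker(f - \lambda_1)^{r_1} \oplus \dots \oplus \ker(f - \lambda_k)^{r_k} \]
and
\[ c_{f_j}(t) = (\lambda_j - t)^{r_j} \]
for every $j \in \{1,\dots,k\}$ where $f_j : \ker(f - \lambda_j)^{r_j} \to \ker(f - \lambda_j)^{r_j}$ is the
endomorphism induced by $f$.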


Proof. From Theorem 4.1.44 applied to the polynomials $(\lambda_1 - t)^{r_1},\dots,(\lambda_k - t)^{r_k}$
we get
\[ \mathcal{V} = \ker(f - \lambda_1)^{r_1} \oplus \dots \oplus \ker(f - \lambda_k)^{r_k}. \]
Now, for every $j \in \{1,\dots,k\}$, the polynomial $(\lambda_j - t)^{r_j}$ is an $f_j$-annihilator.
This implies, by Theorem 4.1.39, that $m_{f_j} = (t - \lambda_j)^{q_j}$ where $q_j$ is an integer
such that $1 \leq q_j \leq r_j$ and $c_{f_j}(t) = (\lambda_j - t)^{s_j}$ where $s_j$ is an integer such that
$q_j \leq s_j$.

From Theorem 4.1.43 we get
\[ c_f = c_{f_1} \cdots c_{f_k}, \]
that is,
\[ (\lambda_1 - t)^{r_1} \cdots (\lambda_k - t)^{r_k} = (\lambda_1 - t)^{s_1} \cdots (\lambda_k - t)^{s_k}. \]
Hence $r_j = s_j$ for every $j \in \{1,\dots,k\}$, completing the proof. □


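Corollary 4.1.46. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism. If
\[ c_f(t) = (\lambda_1 - t)^{r_1} \cdots (\lambda_k - t)^{r_k} \]
for some distinct $\lambda_1,\dots,\lambda_k \in \mathbb{K}$ and some positive integers $r_1,\dots,r_k$,
then for every $j \in \{1,\dots,k\}$ we have
\[ \dim \ker(f - \lambda_j)^{r_j} = r_j. \]

Proof. This follows from the fact that
\[ \deg c_{f_j}(t) = \deg (t - \lambda_j)^{r_j} = r_j \]
for every $j \in \{1,\dots,k\}$. □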




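Definition 4.1.47. An endomorphism $f : \mathcal{V} \to \mathcal{V}$ is called diagonalizable
if $\mathcal{V}$ has a basis consisting of eigenvectors of $f$.

If $\mathcal{V}$ is an $n$-dimensional vector space and an endomorphism $f : \mathcal{V} \to \mathcal{V}$
is diagonalizable, then there is a basis $\mathcal{B} = \{\mathbf{v}_1,\dots,\mathbf{v}_n\}$ of $\mathcal{V}$ and
$\lambda_1,\dots,\lambda_n \in \mathbb{K}$ such that $f(\mathbf{v}_j) = \lambda_j\mathbf{v}_j$ for every $j \in \{1,\dots,n\}$. In other
words, the $\mathcal{B}$-matrix of $f$ is diagonal:
\[ \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \]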


The following result is a direct consequence of the definitions. 


Theorem 4.1.48. Let V be an n-dimensional vector space. An endo- 


morphism f :V — V is diagonalizable if and only if it has n linearly 
independent eigenvectors. 


If $\lambda$ is an eigenvalue of an endomorphism $f : \mathcal{V} \to \mathcal{V}$, then the characteristic
polynomial of $f$ can be written as
\[ \det(f - t) = (\lambda - t)^r q(t), \]
where $q(t)$ is a polynomial such that $q(\lambda) \neq 0$. The number $r$ is called the
algebraic multiplicity of the eigenvalue $\lambda$.

The dimension of the eigenspace of $f$ corresponding to an eigenvalue $\lambda$, that
is $\dim \ker(f - \lambda)$, is called the geometric multiplicity of $\lambda$.

The algebraic multiplicity of an eigenvalue need not be the same as its
geometric multiplicity. Indeed, consider the endomorphism $f : \mathbb{R}^3 \to \mathbb{R}^3$ defined
as $f(\mathbf{x}) = A\mathbf{x}$ where
\[ A = \begin{bmatrix} 3 & 1 & 1 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{bmatrix}. \]
Since $\det(f - t) = (3 - t)^3$, $f$ has only one eigenvalue $\lambda = 3$ whose algebraic
multiplicity is 3. On the other hand, since the dimension of the null space of
the matrix
\[ \begin{bmatrix} 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
is 2, the geometric multiplicity of the eigenvalue 3 is 2.
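The geometric multiplicity $\dim\ker(f - \lambda)$ equals $n$ minus the rank of the matrix of $f - \lambda$, so it can be computed by row reduction. A minimal sketch, with hypothetical function names and exact rational arithmetic, applied to the matrix $A$ above:

```python
from fractions import Fraction


def rank(a):
    """Rank of a matrix, computed by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            m[i] = [vi - factor * vr for vi, vr in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r


def geometric_multiplicity(a, lam):
    """dim ker(A - lam*I), obtained as n - rank(A - lam*I)."""
    n = len(a)
    shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


A = [[3, 1, 1], [0, 3, 0], [0, 0, 3]]
multiplicity_of_3 = geometric_multiplicity(A, 3)
```

For this matrix the value is 2, which is smaller than the algebraic multiplicity 3 of the eigenvalue 3.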




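Theorem 4.1.49. The geometric multiplicity is less than or equal to the
algebraic multiplicity.

Proof. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$ be an endomorphism.
If $\dim\ker(f - \lambda) = k$, then there is a basis $\{\mathbf{v}_1,\dots,\mathbf{v}_n\}$ of $\mathcal{V}$ such
that $\mathbf{v}_1,\dots,\mathbf{v}_k$ are eigenvectors of $f$ corresponding to the eigenvalue $\lambda$.

For any nonzero $n$-linear alternating form $D : \mathcal{V}^n \to \mathbb{K}$ we have
\begin{align*}
c_f(t) D(\mathbf{v}_1,\dots,\mathbf{v}_k,\mathbf{v}_{k+1},\dots,\mathbf{v}_n)
&= D(f(\mathbf{v}_1) - t\mathbf{v}_1,\dots,f(\mathbf{v}_k) - t\mathbf{v}_k, f(\mathbf{v}_{k+1}) - t\mathbf{v}_{k+1},\dots,f(\mathbf{v}_n) - t\mathbf{v}_n) \\
&= D(\lambda\mathbf{v}_1 - t\mathbf{v}_1,\dots,\lambda\mathbf{v}_k - t\mathbf{v}_k, f(\mathbf{v}_{k+1}) - t\mathbf{v}_{k+1},\dots,f(\mathbf{v}_n) - t\mathbf{v}_n) \\
&= (\lambda - t)^k D(\mathbf{v}_1,\dots,\mathbf{v}_k, f(\mathbf{v}_{k+1}) - t\mathbf{v}_{k+1},\dots,f(\mathbf{v}_n) - t\mathbf{v}_n).
\end{align*}
Consequently, $(\lambda - t)^k$ divides $c_f$ because
\[ D(\mathbf{v}_1,\dots,\mathbf{v}_k, f(\mathbf{v}_{k+1}) - t\mathbf{v}_{k+1},\dots,f(\mathbf{v}_n) - t\mathbf{v}_n) = q(t) D(\mathbf{v}_1,\dots,\mathbf{v}_k,\mathbf{v}_{k+1},\dots,\mathbf{v}_n), \]
where $q$ is a polynomial. □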


In Theorem 3.4.10 we show that eigenvectors corresponding to different 
eigenvalues of a normal operator on an inner product space V are orthogonal. 
If V is an arbitrary vector space, then we cannot talk about orthogonality of 
eigenvectors, but we still have linear independence of eigenvectors corresponding 
to different eigenvalues, as the following theorem states. 


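Theorem 4.1.50. Let $\mathcal{V}$ be a vector space and let $f : \mathcal{V} \to \mathcal{V}$ be an
endomorphism. If $\mathbf{v}_1,\dots,\mathbf{v}_k \in \mathcal{V}$ are eigenvectors of $f$ corresponding to
distinct eigenvalues $\lambda_1,\dots,\lambda_k$, then the vectors $\mathbf{v}_1,\dots,\mathbf{v}_k$ are linearly
independent.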


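First proof. If
\[ x_1\mathbf{v}_1 + \dots + x_k\mathbf{v}_k = \mathbf{0} \]
for some $x_1,\dots,x_k \in \mathbb{K}$, then
\[ (f - \lambda_1)\cdots(f - \lambda_{k-1})(x_1\mathbf{v}_1 + \dots + x_k\mathbf{v}_k) = \mathbf{0} \]
and thus
\[ x_k (f - \lambda_1)\cdots(f - \lambda_{k-1})(\mathbf{v}_k) = \mathbf{0}. \]
Since
\begin{align*}
(f - \lambda_1)\cdots(f - \lambda_{k-1})(\mathbf{v}_k)
&= \big((f - \lambda_k) + (\lambda_k - \lambda_1)\big)\cdots\big((f - \lambda_k) + (\lambda_k - \lambda_{k-1})\big)(\mathbf{v}_k) \\
&= (\lambda_k - \lambda_1)\cdots(\lambda_k - \lambda_{k-1})\mathbf{v}_k,
\end{align*}
we get
\[ x_k(\lambda_k - \lambda_1)\cdots(\lambda_k - \lambda_{k-1})\mathbf{v}_k = \mathbf{0}. \]
Consequently $x_k = 0$, because $(\lambda_k - \lambda_1)\cdots(\lambda_k - \lambda_{k-1}) \neq 0$ and $\mathbf{v}_k \neq \mathbf{0}$. In the
same way we can show that $x_1 = \dots = x_{k-1} = 0$. □

Second proof. Since the subspace $\operatorname{Span}\{\mathbf{v}_1,\dots,\mathbf{v}_k\}$ is $f$-invariant, without loss
of generality we can assume that $\mathcal{V} = \operatorname{Span}\{\mathbf{v}_1,\dots,\mathbf{v}_k\}$. Then the eigenvalues
are roots of the polynomial $c_f$ which has the degree equal to $\dim \mathcal{V}$ and consequently
$k \leq \dim \mathcal{V}$. But $\mathcal{V} = \operatorname{Span}\{\mathbf{v}_1,\dots,\mathbf{v}_k\}$, so we must have $k \geq \dim \mathcal{V}$.
Thus $k = \dim \mathcal{V}$ and the vectors $\mathbf{v}_1,\dots,\mathbf{v}_k$ are linearly independent. □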


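Theorem 4.1.51. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism. The following conditions are equivalent:

(a) $f$ is diagonalizable;

(b) There are distinct $\lambda_1,\dots,\lambda_k \in \mathbb{K}$ and positive integers $r_1,\dots,r_k$
such that
\[ c_f(t) = (\lambda_1 - t)^{r_1} \cdots (\lambda_k - t)^{r_k} \quad\text{and}\quad \dim\ker(f - \lambda_j) = r_j \text{ for every } j \in \{1,\dots,k\}; \]

(c) If $\lambda_1,\dots,\lambda_k$ are all distinct eigenvalues of $f$, then
\[ \ker(f - \lambda_1) \oplus \dots \oplus \ker(f - \lambda_k) = \mathcal{V}; \]

(d) If $\lambda_1,\dots,\lambda_k$ are all distinct eigenvalues of $f$, then
\[ n = \dim\ker(f - \lambda_1) + \dots + \dim\ker(f - \lambda_k); \]

(e) If $\lambda_1,\dots,\lambda_k$ are all distinct eigenvalues of $f$, then
\[ m_f(t) = (t - \lambda_1)\cdots(t - \lambda_k). \]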


Proof. First we prove equivalence of (a) and (b). Then we show that (c) implies
(d), (d) implies (a), (a) implies (e), and (e) implies (c).
If $f : \mathcal{V} \to \mathcal{V}$ is diagonalizable, then there is a basis $\{\mathbf{v}_1,\dots,\mathbf{v}_n\}$ of $\mathcal{V}$ such
that
\[ f(\mathbf{v}_j) = \lambda_j\mathbf{v}_j \]
for some $\lambda_1,\dots,\lambda_n \in \mathbb{K}$ and all $j \in \{1,\dots,n\}$. If $D : \mathcal{V}^n \to \mathbb{K}$ is a nonzero
$n$-linear alternating form, then
\begin{align*}
D(f(\mathbf{v}_1) - t\mathbf{v}_1,\dots,f(\mathbf{v}_n) - t\mathbf{v}_n) &= D((\lambda_1 - t)\mathbf{v}_1,\dots,(\lambda_n - t)\mathbf{v}_n) \\
&= (\lambda_1 - t)\cdots(\lambda_n - t) D(\mathbf{v}_1,\dots,\mathbf{v}_n)
\end{align*}
and thus $c_f(t) = (\lambda_1 - t)\cdots(\lambda_n - t)$. Without loss of generality, we can suppose
that $\lambda_1,\dots,\lambda_k$ are distinct numbers such that $\{\lambda_{k+1},\dots,\lambda_n\} \subseteq \{\lambda_1,\dots,\lambda_k\}$.
Consequently,
\[ c_f(t) = (\lambda_1 - t)^{r_1} \cdots (\lambda_k - t)^{r_k} \]
where $r_1,\dots,r_k$ are positive integers such that $r_1 + \dots + r_k = n$.
Because for every $l \in \{1,\dots,n\}$ there is a $j \in \{1,\dots,k\}$ such that $\mathbf{v}_l \in \ker(f - \lambda_j)$,
we have
\[ \ker(f - \lambda_1) + \dots + \ker(f - \lambda_k) = \mathcal{V}. \]
Now since for every $j \in \{1,\dots,k\}$ we have $\ker(f - \lambda_j) \subseteq \ker(f - \lambda_j)^{r_j}$ and,
by Corollary 4.1.46, $\dim\ker(f - \lambda_j)^{r_j} = r_j$, we conclude that
\[ \dim\ker(f - \lambda_j) = r_j \]
for every $j \in \{1,\dots,k\}$. Thus (a) implies (b).
Now suppose that we can write
\[ c_f(t) = (\lambda_1 - t)^{r_1} \cdots (\lambda_k - t)^{r_k} \]
where $\lambda_1,\dots,\lambda_k \in \mathbb{K}$ are distinct and $r_1,\dots,r_k$ are positive integers such that
$\dim\ker(f - \lambda_j) = r_j$ for every $j \in \{1,\dots,k\}$. Then, by Corollary 4.1.46,
$\ker(f - \lambda_j) = \ker(f - \lambda_j)^{r_j}$ and, by Theorem 4.1.45, the sum
\[ \ker(f - \lambda_1) + \dots + \ker(f - \lambda_k) \]
is direct and we have
\[ \ker(f - \lambda_1) \oplus \dots \oplus \ker(f - \lambda_k) = \mathcal{V}. \]
Consequently, if $\mathcal{B}_j$ is a basis of $\ker(f - \lambda_j)$ for $j \in \{1,\dots,k\}$, then $\mathcal{B}_1 \cup \dots \cup \mathcal{B}_k$
is a basis of $\mathcal{V}$ because $r_1 + \dots + r_k = n$. Since all elements of $\mathcal{B}_1 \cup \dots \cup \mathcal{B}_k$ are
eigenvectors, $f$ is diagonalizable. Thus (b) implies (a).

Clearly (c) implies (d).

Next suppose that
\[ \dim\ker(f - \lambda_1) + \dots + \dim\ker(f - \lambda_k) = n. \]
Since $\ker(f - \lambda_j) \subseteq \ker(f - \lambda_j)^{r_j}$, it follows from Theorem 4.1.45 that the sum
\[ \ker(f - \lambda_1) + \dots + \ker(f - \lambda_k) \]
is direct. Now we construct a basis of eigenvectors of $f$ as in the proof of the
first part ((b) implies (a)). This proves that (d) implies (a).

Next assume that the endomorphism $f$ satisfies (a). Since $(t - \lambda_1)\cdots(t - \lambda_k)$
is an annihilator of $f$ and $m_f$ is a monic polynomial which has all distinct
eigenvalues as zeros and divides every annihilator of $f$, we have
\[ m_f(t) = (t - \lambda_1)\cdots(t - \lambda_k), \]
so (a) implies (e).
Finally, (e) implies (c), because if $m_f(t) = (t - \lambda_1)\cdots(t - \lambda_k)$, then
\[ \ker(f - \lambda_1) \oplus \dots \oplus \ker(f - \lambda_k) = \mathcal{V}, \]
by Theorem 4.1.44. □
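Condition (e) of Theorem 4.1.51 gives a simple computational test: once all distinct eigenvalues of $f$ are known, $f$ is diagonalizable exactly when the product of the operators $f - \lambda_j$ is zero. The sketch below assumes that the caller supplies the complete list of distinct eigenvalues; the function name and the sample matrices are illustrative.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def annihilated_by_distinct_linear_factors(a, eigenvalues):
    """True if (A - l_1)...(A - l_k) = 0, where l_1, ..., l_k are the distinct listed eigenvalues."""
    n = len(a)
    product = [[int(i == j) for j in range(n)] for i in range(n)]
    for lam in sorted(set(eigenvalues)):
        shifted = [[a[i][j] - (lam if i == j else 0) for j in range(n)]
                   for i in range(n)]
        product = matmul(product, shifted)
    return all(v == 0 for row in product for v in row)


# Eigenvalues 2 and 6; the test succeeds, so the matrix is diagonalizable.
first = annihilated_by_distinct_linear_factors(
    [[3, 1, 2], [1, 3, 2], [1, 1, 4]], [2, 6])

# Only eigenvalue 3, but A - 3 is not zero, so the matrix is not diagonalizable.
second = annihilated_by_distinct_linear_factors(
    [[3, 1, 1], [0, 3, 0], [0, 0, 3]], [3])
```

The test is only meaningful if the eigenvalue list is complete and all eigenvalues lie in $\mathbb{K}$; with a missing eigenvalue the product is nonzero even for a diagonalizable matrix.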


Example 4.1.52. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be the endomorphism defined by


(tl) =a sl bo: 


Show that f is not diagonalizable. 


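Proof. Let $D : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ be a nonzero alternating bilinear form. Proceeding
as in Example 4.1.5 we obtain
\[ D\left(f\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right) - t\begin{bmatrix} 1 \\ 0 \end{bmatrix}, f\left(\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) - t\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right) = (t - 3)^2\, D\left(\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}\right). \]
Hence $c_f(t) = (t - 3)^2$.
Next we determine the eigenvectors corresponding to the eigenvalue 3. The
equation
\[ f\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = 3\begin{bmatrix} x \\ y \end{bmatrix} \]
is equivalent to the equation $x = 2y$. Consequently, the eigenspace corresponding
to the eigenvalue 3 is
\[ \mathcal{E}_3 = \operatorname{Span}\left\{\begin{bmatrix} 2 \\ 1 \end{bmatrix}\right\}. \]
Because $\dim \mathcal{E}_3 < 2$, the endomorphism $f$ is not diagonalizable. □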


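Example 4.1.53. Find $f^n$ if $f : \mathbb{R}^2 \to \mathbb{R}^2$ is the endomorphism defined by
\[ f\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}. \]

Proof. The eigenvalues are given by the equation
\[ (4 - t)(3 - t) - 2 = 0 \]
and are equal to 2 and 5. It is easy to verify that
\[ \mathcal{E}_2 = \operatorname{Span}\left\{\begin{bmatrix} 1 \\ -2 \end{bmatrix}\right\} \quad\text{and}\quad \mathcal{E}_5 = \operatorname{Span}\left\{\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right\}. \]
Since
\[ \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 5 \end{bmatrix}, \]
we have
\[ \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 5 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix}^{-1} \]
and thus
\[ \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}^n = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix}\begin{bmatrix} 2^n & 0 \\ 0 & 5^n \end{bmatrix}\cdot\frac{1}{3}\begin{bmatrix} 1 & -1 \\ 2 & 1 \end{bmatrix}. \]
Consequently
\[ f^n\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \frac{1}{3}\begin{bmatrix} 2^n + 2\cdot 5^n & -2^n + 5^n \\ -2^{n+1} + 2\cdot 5^n & 2^{n+1} + 5^n \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}. \]
□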
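A closed formula for $f^n$ obtained by diagonalization is easy to test against repeated multiplication. The sketch below assumes the matrix with rows $(4,1)$, $(2,3)$, which has eigenvalues 2 and 5, and the closed form $\tfrac{1}{3}\begin{bmatrix} 2^n + 2\cdot 5^n & -2^n + 5^n \\ -2^{n+1} + 2\cdot 5^n & 2^{n+1} + 5^n \end{bmatrix}$; the function names are illustrative.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matpow(a, n):
    """n-th power of a square matrix (n >= 0), by repeated multiplication."""
    size = len(a)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, a)
    return result


def closed_form(n):
    """Candidate formula for the n-th power, coming from the eigenvalues 2 and 5."""
    return [
        [Fraction(2**n + 2 * 5**n, 3), Fraction(-(2**n) + 5**n, 3)],
        [Fraction(-(2**(n + 1)) + 2 * 5**n, 3), Fraction(2**(n + 1) + 5**n, 3)],
    ]


A = [[4, 1], [2, 3]]
formula_agrees = all(matpow(A, n) == closed_form(n) for n in range(8))
```

The comparison is made for $n = 0, \dots, 7$ only; the general statement follows from the diagonalization argument, not from the check.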


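Example 4.1.54. Show that the endomorphism $f : \mathbb{R}^3 \to \mathbb{R}^3$ defined by
\[ f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 3 & 1 & 2 \\ 1 & 3 & 2 \\ 1 & 1 & 4 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
is diagonalizable.

Solution. According to Example 4.1.28 we have $c_f(t) = (2 - t)^2(6 - t)$. It is
enough to show that $\dim\ker(f - 2) = 2$. The equation
\[ \begin{bmatrix} 3 & 1 & 2 \\ 1 & 3 & 2 \\ 1 & 1 & 4 \end{bmatrix}\begin{bmatrix} x \\ y \\ z \end{bmatrix} = 2\begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
is equivalent to the equation $x + y + 2z = 0$. Consequently,
\[ \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} -y - 2z \\ y \\ z \end{bmatrix} = y\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} + z\begin{bmatrix} -2 \\ 0 \\ 1 \end{bmatrix}, \]
which means that
\[ \mathcal{E}_2 = \operatorname{Span}\left\{\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -2 \\ 0 \\ 1 \end{bmatrix}\right\}. \]
Since the vectors $\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} -2 \\ 0 \\ 1 \end{bmatrix}$ are linearly independent, $\dim \mathcal{E}_2 = 2$. □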




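Example 4.1.55. Let $\mathcal{V}$ be an $n$-dimensional vector space and let $f : \mathcal{V} \to \mathcal{V}$
be an endomorphism such that $m_f(t) = t(t + 1)$ and $\dim\ker f = k$. Determine
$c_f$.

Solution. The endomorphism $f$ is diagonalizable by Theorem 4.1.51. The only
eigenvalues are 0 and $-1$. Let $\{\mathbf{v}_1,\dots,\mathbf{v}_n\}$ be a basis of eigenvectors such that
$\{\mathbf{v}_1,\dots,\mathbf{v}_k\}$ is a basis of $\ker f = \mathcal{E}_0$ and $\{\mathbf{v}_{k+1},\dots,\mathbf{v}_n\}$ is a basis of $\mathcal{E}_{-1}$. Then
\[ f(\mathbf{v}_j) = \mathbf{0} \quad\text{if } 1 \leq j \leq k \]
and
\[ f(\mathbf{v}_j) = -\mathbf{v}_j \quad\text{if } k + 1 \leq j \leq n. \]
Consequently
\[ c_f(t) = (-t)^k(-1 - t)^{n-k}. \]
□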


4.2 Jordan canonical form 


Diagonalizable endomorphisms have many good properties that are useful in 
applications. However, we often have to deal with endomorphisms that are not 
diagonalizable. In this section we study the structure of such endomorphisms. 


4.2.1 Jordan canonical form when the characteristic polynomial has one root


We begin by considering two examples. 


Example 4.2.1. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be the endomorphism defined by


(il) = [4 la) 


Show that $f$ is not diagonalizable and find a basis $\mathcal{B}$ of $\mathbb{R}^2$ such that the matrix
of $f$ in $\mathcal{B}$ is $\begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}$.

Proof. First we find that cr(t) = (t — 3)? and ker(f — 3) = Span { ; \ This 
shows that f is not diagonalizable because dim ker(f — 3) = 1. 


Next we choose a vector that is not in ker(f—3), for example 4 . The vec- 




tor (f —3) a is in ker(f — 3) because according to Cayley-Hamilton theorem 


sre 0 el" 
(ote ae ee 


is a basis with the desired property. 
Note that m(t) = (t — 3)?. Oo 


we have 


Example 4.2.2. Let $V$ be a vector space such that $\dim V = 3$. Let $f : V \to V$ be a linear transformation and let $a \in K$. If $m_f(t) = (t - a)^3$, show that there is a basis $\mathcal{P}$ of $V$ such that the $\mathcal{P}$-matrix of the linear transformation $f$ is

\[
\begin{bmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{bmatrix}.
\]

Solution. Let $v$ be a vector in $\ker(f - a)^3 = V$ that is not in $\ker(f - a)^2$. Then $(f - a)v \in \ker(f - a)^2$ and $(f - a)v \notin \ker(f - a)$. We will show that

\[
\mathcal{P} = \{(f - a)^2 v, (f - a)v, v\}
\]

is a basis with the desired property.
Let $x_1, x_2, x_3 \in K$ be such that

\[
x_1 (f - a)^2 v + x_2 (f - a) v + x_3 v = 0. \tag{4.4}
\]

By applying $(f - a)^2$ to (4.4) we get

\[
x_3 (f - a)^2 v = 0
\]

and consequently $x_3 = 0$. Next, by applying $f - a$ to the equality

\[
x_1 (f - a)^2 v + x_2 (f - a) v = 0
\]

we get

\[
x_2 (f - a)^2 v = 0,
\]

which gives us $x_2 = 0$. Now (4.4) becomes

\[
x_1 (f - a)^2 v = 0,
\]

which gives us $x_1 = 0$. This shows that the vectors $(f - a)^2 v$, $(f - a)v$, and $v$ are linearly independent and consequently

\[
\mathcal{P} = \{(f - a)^2 v, (f - a)v, v\}
\]

is a basis of $V$.
Since

\[
\begin{aligned}
f((f - a)^2 v) &= (f - a)(f - a)^2 v + a (f - a)^2 v = a (f - a)^2 v, \\
f((f - a) v) &= (f - a)^2 v + a (f - a) v, \\
f(v) &= (f - a) v + a v,
\end{aligned}
\]

the $\mathcal{P}$-matrix of $f$ is the matrix displayed in the statement. □


The minimal polynomial of an endomorphism $f : V \to V$ on a finite dimensional vector space is defined as the unique monic polynomial $m_f$ of smallest positive degree such that $m_f(f) = 0$, that is, $m_f(f)(v) = 0$ for every $v \in V$. Now we consider a similar property relative to a fixed $v \in V$.


Example 4.2.3. Let $f : V \to V$ be an endomorphism on a vector space $V$ and let $x \in V$. If $(f - a)^3(x) = 0$, show that the subspace $\operatorname{Span}\{x, f(x), f^2(x)\}$ is $f$-invariant.


Proof. The equality

\[
(f - a)^3(x) = (f^3 - 3a f^2 + 3a^2 f - a^3)(x) = 0
\]

gives us

\[
f^3(x) = (3a f^2 - 3a^2 f + a^3)(x)
\]

and then, for every integer $n \ge 3$, we get

\[
f^n(x) = (3a f^{n-1} - 3a^2 f^{n-2} + a^3 f^{n-3})(x).
\]

This implies by induction that for every nonnegative integer $n$ we have

\[
f^n(x) \in \operatorname{Span}\{x, f(x), f^2(x)\}.
\]

Our result is an immediate consequence of this fact. □
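To make the induction step in Example 4.2.3 concrete, here is the first case, $n = 4$, worked out as a sketch. It only uses the relation $f^3(x) = (3a f^2 - 3a^2 f + a^3)(x)$ obtained from $(f-a)^3(x) = 0$; no other assumption is made.

```latex
\begin{aligned}
f^4(x) &= f\bigl(f^3(x)\bigr) = 3a\, f^3(x) - 3a^2 f^2(x) + a^3 f(x) \\
       &= 3a\bigl(3a f^2(x) - 3a^2 f(x) + a^3 x\bigr) - 3a^2 f^2(x) + a^3 f(x) \\
       &= 6a^2 f^2(x) - 8a^3 f(x) + 3a^4 x \in \operatorname{Span}\{x, f(x), f^2(x)\}.
\end{aligned}
```

The coefficients agree with the remainder of $t^4$ after division by $(t-a)^3$, namely $6a^2 t^2 - 8a^3 t + 3a^4$.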




Theorem 4.2.4. Let $V$ be a finite dimensional vector space, $f : V \to V$ a nonzero endomorphism, and $v$ a nonzero vector in $V$. Then


(a) There is a unique monic polynomial $m_{f,v}$ of smallest positive degree such that $m_{f,v}(f)(v) = 0$;


(b) $m_{f,v}$ divides every polynomial $p$ such that $p(f)(v) = 0$;


(c) If the degree of $m_{f,v}$ is $k$, then the vectors $v, f(v), \dots, f^{k-1}(v)$ are linearly independent and $\operatorname{Span}\{v, f(v), \dots, f^{k-1}(v)\}$ is the smallest $f$-invariant subspace of $V$ containing $v$.


Proof. (a) and (b) can be obtained as in the proof of Theorem 4.1.39.

If the degree of $m_{f,v}$ is $k$, then the vectors $v, f(v), \dots, f^{k-1}(v)$ are linearly independent because, if $x_0, \dots, x_{k-1} \in K$ are not all 0, then the degree of the polynomial $x_0 + x_1 t + \dots + x_{k-1} t^{k-1}$ is strictly less than the degree of $m_{f,v}$, so this polynomial cannot annihilate $v$, that is, $x_0 v + x_1 f(v) + \dots + x_{k-1} f^{k-1}(v) \neq 0$.

To show that the subspace $\operatorname{Span}\{v, f(v), \dots, f^{k-1}(v)\}$ is $f$-invariant it is enough to show that $f^n(v)$ is in this subspace for every positive integer $n$. Indeed, there are polynomials $q$ and $r$ such that

\[
t^n = m_{f,v}(t) q(t) + r(t),
\]

where $r = 0$ or $r \neq 0$ and the degree of $r$ is strictly less than the degree of $m_{f,v}$. Thus $f^n(v) = r(f)(v)$ and the subspace $\operatorname{Span}\{v, f(v), \dots, f^{k-1}(v)\}$ is $f$-invariant.

Clearly, if $U$ is an $f$-invariant subspace of $V$ such that $v \in U$, then $\operatorname{Span}\{v, f(v), \dots, f^{k-1}(v)\} \subseteq U$. □


Definition 4.2.5. Let $V$ be a vector space, $f : V \to V$ a nonzero endomorphism and $v$ a nonzero vector in $V$ such that the degree of $m_{f,v}$ is $k$. The $f$-invariant subspace $\operatorname{Span}\{v, f(v), \dots, f^{k-1}(v)\}$ is called the cyclic subspace of $f$ associated with $v$ and is denoted by $V_{f,v}$.


Theorem 4.2.6. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism such that $m_f(t) = (t - a)^k$ for some integer $k \ge 1$ and some $a \in K$. Then there is a nonzero vector $v \in V$ such that $m_{f,v} = m_f$.


Proof. Let $v \in V$ be such that $(f - a)^k v = 0$ and $(f - a)^{k-1} v \neq 0$. Then $m_f(t) = (t - a)^k = m_{f,v}(t)$. □




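Theorem 4.2.7. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. For a nonzero vector $v \in V$ the following conditions are equivalent.


(a) $m_{f,v} = (t - a)^k$ for some integer $k \ge 1$ and some $a \in K$;


(b) The set $\mathcal{B} = \{(f - a)^{k-1} v, \dots, (f - a)v, v\}$ is a basis of $V_{f,v}$;


(c) There is a basis $\mathcal{B} = \{v_1, \dots, v_{k-1}, v\}$ of an $f$-invariant subspace $U$ of $V$ such that the $\mathcal{B}$-matrix of the endomorphism $g : U \to U$ induced by $f$ is

\[
\begin{bmatrix}
a & 1 & 0 & \cdots & 0 & 0 \\
0 & a & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a & 1 \\
0 & 0 & 0 & \cdots & 0 & a
\end{bmatrix}.
\]


Proof. Suppose that $m_{f,v} = (t - a)^k$. Because for every $j \in \{0, \dots, k-1\}$ we have

\[
f^j(v) = (f - a + a)^j(v) = (f - a)^j(v) + j a (f - a)^{j-1}(v) + \dots + j a^{j-1} (f - a)(v) + a^j v,
\]

we get

\[
V_{f,v} = \operatorname{Span}\{v, f(v), \dots, f^{k-1}(v)\} \subseteq \operatorname{Span}\{v, (f - a)v, \dots, (f - a)^{k-1} v\}.
\]

Hence

\[
V_{f,v} = \operatorname{Span}\{v, f(v), \dots, f^{k-1}(v)\} = \operatorname{Span}\{v, (f - a)v, \dots, (f - a)^{k-1} v\},
\]

because the vectors $v, f(v), \dots, f^{k-1}(v)$ are linearly independent. Consequently

\[
\mathcal{B} = \{(f - a)^{k-1} v, \dots, (f - a)v, v\}
\]

is a basis of $V_{f,v}$. This proves (a) implies (b).


Suppose now that $\mathcal{B} = \{(f - a)^{k-1} v, \dots, (f - a)v, v\}$ is a basis of $V_{f,v}$. In order to find the $\mathcal{B}$-matrix of $f$ we note that

\[
f((f - a)^{k-1}(v)) = (f - a)^k(v) + a (f - a)^{k-1}(v) = a (f - a)^{k-1}(v)
\]

and

\[
f((f - a)^j(v)) = (f - a)^{j+1}(v) + a (f - a)^j(v) = 1 \cdot (f - a)^{j+1}(v) + a (f - a)^j(v)
\]

for every $j \in \{0, \dots, k-2\}$. To prove that (b) implies (c) we take $U = V_{f,v}$.

Now suppose that there is a basis $\mathcal{B} = \{v_1, \dots, v_{k-1}, v\}$ of an $f$-invariant subspace $U$ of $V$ such that the $\mathcal{B}$-matrix of the endomorphism $g : U \to U$ induced by $f$ is

\[
\begin{bmatrix}
a & 1 & 0 & \cdots & 0 & 0 \\
0 & a & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a & 1 \\
0 & 0 & 0 & \cdots & 0 & a
\end{bmatrix}.
\]

Then

\[
\begin{aligned}
(f - a) v_1 &= 0, \\
(f - a) v_2 &= v_1, \\
&\;\;\vdots \\
(f - a) v_{k-1} &= v_{k-2}, \\
(f - a) v &= v_{k-1}.
\end{aligned}
\]

Consequently

\[
(f - a)^{k-1} v = v_1 \neq 0 \quad \text{and} \quad (f - a)^k v = 0.
\]

Hence $m_{f,v} = (t - a)^k$, which proves that (c) implies (a). □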


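Example 4.2.8. Let $f : \mathcal{P}_3(\mathbb{R}) \to \mathcal{P}_3(\mathbb{R})$ be the endomorphism defined by $f(p) = p'$. Determine $m_f$, $V_{f,x^2+1}$ and $m_{f,x^2+1}$.


Proof. Since

\[
f(x^2 + 1) = 2x, \quad f(2x) = 2, \quad \text{and} \quad f(2) = 0,
\]

we have $m_{f,x^2+1} = t^3$, $V_{f,x^2+1} = \mathcal{P}_2(\mathbb{R})$, and $m_f(t) = t^4$. □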
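The computation in Example 4.2.8 can be mirrored in a few lines of code. This is a minimal sketch, assuming polynomials are stored as coefficient lists `[c0, c1, c2, ...]`; the function names are hypothetical. Because differentiation is nilpotent on polynomials, $m_{f,p}$ is $t^k$, where $k$ is the smallest number of derivatives that sends $p$ to zero.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (c_j multiplies x**j)."""
    return [j * c for j, c in enumerate(coeffs)][1:]

def annihilating_power(coeffs):
    """Smallest k such that the k-th derivative of the polynomial is zero."""
    k = 0
    while any(coeffs):
        coeffs = derivative(coeffs)
        k += 1
    return k

p = [1, 0, 1]      # the polynomial x**2 + 1
cubic = [0, 0, 0, 1]  # the polynomial x**3, a vector of maximal degree in P_3(R)

print(annihilating_power(p))      # degree of m_{f, x^2+1}
print(annihilating_power(cubic))  # degree of m_f
```

The value for `p` is also the dimension of the cyclic subspace $V_{f,x^2+1}$, in agreement with Theorem 4.2.4(c).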




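Definition 4.2.9. Let $a \in K$ and let $k$ be a positive integer. The $k \times k$ matrix

\[
\begin{bmatrix}
a & 1 & 0 & \cdots & 0 & 0 \\
0 & a & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a & 1 \\
0 & 0 & 0 & \cdots & 0 & a
\end{bmatrix}
\]

is called a Jordan block and is denoted by $J_{a,k}$. The $1 \times 1$ matrix $[a]$ is also considered a Jordan block.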


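Example 4.2.10. Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be the endomorphism defined by

\[
f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 2 & 5 & 1 \\ 0 & 5 & -1 \\ -1 & 2 & 5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}.
\]

Show that $m_f(t) = (t - 4)^3$ and find a basis of $\mathbb{R}^3$ such that the matrix of $f$ in that basis is the Jordan block

\[
\begin{bmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{bmatrix}.
\]


Proof. First, proceeding as in Example 4.1.28, we get $c_f(t) = (4 - t)^3$. Hence, by the Cayley-Hamilton theorem, we have $(f - 4)^3 = 0$.
To determine $\ker(f - 4)^2$ we solve the equation

\[
\left( \begin{bmatrix} 2 & 5 & 1 \\ 0 & 5 & -1 \\ -1 & 2 & 5 \end{bmatrix} - \begin{bmatrix} 4 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 4 \end{bmatrix} \right)^2 \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\]

that is,

\[
\begin{bmatrix} 3 & -3 & -6 \\ 1 & -1 & -2 \\ 1 & -1 & -2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\]

which gives us $x - y - 2z = 0$. Consequently

\[
\ker(f - 4)^2 = \left\{ \begin{bmatrix} x \\ y \\ z \end{bmatrix} \in \mathbb{R}^3 : x - y - 2z = 0 \right\}.
\]

Because $\ker(f - 4)^2 \neq \mathbb{R}^3$, we have $m_f(t) = (t - 4)^3$.

Since $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \notin \ker(f - 4)^2$, according to Theorem 4.2.7, the set

\[
\left\{ (f - 4)^2 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, (f - 4) \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\} = \left\{ \begin{bmatrix} -3 \\ -1 \\ -1 \end{bmatrix}, \begin{bmatrix} 5 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\}
\]

is a basis satisfying the required condition. Note that $m_{f,v} = (t - 4)^3$ for $v = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$. □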
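The basis found in Example 4.2.10 can be verified mechanically. The sketch below assumes that the matrix of $f$ has rows (2, 5, 1), (0, 5, -1), (-1, 2, 5) and that the starting vector is the second standard basis vector; the helper names `matmul` and `det3` are ours. It builds the matrix $P$ whose columns are $(f-4)^2 v$, $(f-4)v$, $v$ and checks that $AP = PJ$ for the Jordan block $J = J_{4,3}$ and that $P$ is invertible.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

A = [[2, 5, 1], [0, 5, -1], [-1, 2, 5]]           # assumed matrix of f
N = [[A[i][j] - (4 if i == j else 0) for j in range(3)] for i in range(3)]  # f - 4

v = [[0], [1], [0]]        # a vector outside ker (f - 4)^2
Nv = matmul(N, v)          # (f - 4) v
N2v = matmul(N, Nv)        # (f - 4)^2 v

# Columns of P are the basis vectors (f-4)^2 v, (f-4) v, v, in this order.
P = [[N2v[i][0], Nv[i][0], v[i][0]] for i in range(3)]
J = [[4, 1, 0], [0, 4, 1], [0, 0, 4]]

print(P)
print(matmul(A, P) == matmul(P, J), det3(P) != 0)
```

The identity $AP = PJ$ with $P$ invertible is exactly the statement that the matrix of $f$ in the basis formed by the columns of $P$ is $J$.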


In all examples discussed so far the matrix of the linear transformation had 
one Jordan block. Now we consider two examples of transformations with two 
Jordan blocks. 

In the next several examples we added horizontal and vertical lines in the 
matrix to visualize and separate different Jordan blocks. The lines have no other 
mathematical meaning. 


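Example 4.2.11. Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be the endomorphism defined by

\[
f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 5 & 2 & 4 \\ 0 & 7 & 0 \\ -1 & 1 & 9 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}.
\]

Find a basis of $\mathbb{R}^3$ such that the matrix of $f$ in that basis is

\[
\left[\begin{array}{cc|c} 7 & 1 & 0 \\ 0 & 7 & 0 \\ \hline 0 & 0 & 7 \end{array}\right].
\]


Proof. First we find that $c_f(t) = (7 - t)^3$. We determine $\ker(f - 7)$ by solving the equation

\[
\left( \begin{bmatrix} 5 & 2 & 4 \\ 0 & 7 & 0 \\ -1 & 1 & 9 \end{bmatrix} - \begin{bmatrix} 7 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 7 \end{bmatrix} \right) \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\]

that is,

\[
\begin{bmatrix} -2 & 2 & 4 \\ 0 & 0 & 0 \\ -1 & 1 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
\]

which gives us $x - y - 2z = 0$. Consequently

\[
\ker(f - 7) = \left\{ \begin{bmatrix} x \\ y \\ z \end{bmatrix} \in \mathbb{R}^3 : x - y - 2z = 0 \right\}.
\]

Since $\dim \ker(f - 7) = 2$, the endomorphism $f$ is not diagonalizable. It is easy to see that $m_f(t) = (t - 7)^2$. Because $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \notin \ker(f - 7)$ and $\dim \ker(f - 7) = 2$, we have

\[
\ker(f - 7) \oplus \operatorname{Span}\left\{ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} = \mathbb{R}^3. \tag{4.5}
\]

Now, since $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \in \ker(f - 7)$ and the eigenvectors $\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ and $(f - 7) \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \\ 2 \end{bmatrix}$ are linearly independent, it is easy to verify, using (4.5), that the set

\[
\left\{ (f - 7) \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \right\} = \left\{ \begin{bmatrix} 4 \\ 0 \\ 2 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \right\}
\]

is a basis satisfying the required condition.
Note that $m_{f,v} = (t - 7)^2 = m_f$ for $v = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$. □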


Example 4.2.12. Let $V$ be a vector space such that $\dim V = 5$. Let $f : V \to V$ be an endomorphism and let $a \in K$. We assume that $c_f(t) = (a - t)^5$ and $m_f(t) = (t - a)^3$. If

\[
\dim \ker(f - a) = 2 \quad \text{and} \quad \dim \ker(f - a)^2 = 4,
\]

show that there is a basis $\mathcal{P}$ of $V$ such that the $\mathcal{P}$-matrix of $f$ is

\[
\left[\begin{array}{ccc|cc}
a & 1 & 0 & 0 & 0 \\
0 & a & 1 & 0 & 0 \\
0 & 0 & a & 0 & 0 \\ \hline
0 & 0 & 0 & a & 1 \\
0 & 0 & 0 & 0 & a
\end{array}\right].
\]


Solution. Let $u$ be a vector in $\ker(f - a)^3$ which is not in $\ker(f - a)^2$. Then the vector $(f - a)u$ is in $\ker(f - a)^2$ and not in $\ker(f - a)$. Because $\dim \ker(f - a)^2 = 4$ and $\dim \ker(f - a) = 2$, we can choose a vector $v \in \ker(f - a)^2$ such that $\{(f - a)u, v\}$ is a basis of a complement to $\ker(f - a)$ in $\ker(f - a)^2$. We will show that

\[
\{(f - a)^2 u, (f - a)u, u, (f - a)v, v\}
\]

is a basis that has the desired property.
First we show that the vectors $(f - a)^2 u$, $(f - a)u$, $u$, $(f - a)v$, $v$ are linearly independent. If

\[
x_1 (f - a)^2 u + x_2 (f - a) u + x_3 u + x_4 (f - a) v + x_5 v = 0 \tag{4.6}
\]

for some $x_1, x_2, x_3, x_4, x_5 \in K$, by applying $(f - a)^2$ to the above equation we obtain

\[
x_3 (f - a)^2 u = 0
\]

and consequently $x_3 = 0$. Next, by applying $f - a$ to (4.6) we get

\[
x_2 (f - a)^2 u + x_5 (f - a) v = (f - a)(x_2 (f - a) u + x_5 v) = 0,
\]

which yields that $x_2 = x_5 = 0$ because $\{(f - a)u, v\}$ is a basis of a complement to $\ker(f - a)$ in $\ker(f - a)^2$. Now (4.6) becomes

\[
x_1 (f - a)^2 u + x_4 (f - a) v = (f - a)(x_1 (f - a) u + x_4 v) = 0,
\]

which gives us $x_1 = x_4 = 0$, again because $\{(f - a)u, v\}$ is a basis of a complement to $\ker(f - a)$ in $\ker(f - a)^2$.

We have proved that the vectors $(f - a)^2 u$, $(f - a)u$, $u$, $(f - a)v$, $v$ are linearly independent and consequently

\[
\mathcal{P} = \{(f - a)^2 u, (f - a)u, u, (f - a)v, v\}
\]

is a basis of $V$. Since

\[
\begin{aligned}
f((f - a)^2 u) &= (f - a)(f - a)^2 u + a (f - a)^2 u = a (f - a)^2 u, \\
f((f - a) u) &= (f - a)(f - a) u + a (f - a) u = (f - a)^2 u + a (f - a) u, \\
f(u) &= (f - a) u + a u, \\
f((f - a) v) &= (f - a)(f - a) v + a (f - a) v = a (f - a) v, \\
f(v) &= (f - a) v + a v,
\end{aligned}
\]

the $\mathcal{P}$-matrix of $f$ is

\[
\left[\begin{array}{ccc|cc}
a & 1 & 0 & 0 & 0 \\
0 & a & 1 & 0 & 0 \\
0 & 0 & a & 0 & 0 \\ \hline
0 & 0 & 0 & a & 1 \\
0 & 0 & 0 & 0 & a
\end{array}\right].
\]

Note that the assumption that $\dim \ker(f - a)^2 = 4$ is unnecessary because, if $u$ and $v$ were linearly independent vectors in a complement of the vector subspace $\ker(f - a)^2$, then we could show, as before, that the vectors

\[
(f - a)^2 u, \; (f - a) u, \; u, \; (f - a)^2 v, \; (f - a) v, \; v
\]

are linearly independent, which is not possible because $\dim V = 5$. □


The next result plays a central role in this and the following sections. The proof is difficult in comparison with the other proofs presented in this book so far and it can be skipped at the first reading. On the other hand, understanding the proof, which uses many ideas presented in this book, is a good indication that you understand linear algebra at the level expected in a second course.


Lemma 4.2.13 (Fundamental lemma). Let $V$ be an $n$-dimensional vector space and let $f : V \to V$ be an endomorphism. Let $v \in V$ be such that

\[
m_f(t) = m_{f,v}(t) = a_0 + a_1 t + \dots + a_{k-1} t^{k-1} + t^k
\]

for some $k \in \{1, 2, \dots, n\}$ and $a_0, a_1, \dots, a_{k-1} \in K$. Then there is an $f$-invariant subspace $W$ such that

\[
\operatorname{Span}\{v, f(v), \dots, f^{k-1}(v)\} \oplus W = V.
\]


Proof. Let $\{v, f(v), \dots, f^{k-1}(v), v_{k+1}, \dots, v_n\}$ be a basis of $V$ and let $g : V \to K$ be the linear functional such that $g(f^j(v)) = 0$ for $j < k-1$, $g(f^{k-1}(v)) = 1$, and $g(v_j) = 0$ for $j > k$. We define

\[
W = \{w \in V : g(f^j(w)) = 0, \; j \in \{0, 1, 2, \dots\}\}.
\]

Clearly, $W$ is a subspace of $V$. Because for every $j \in \{0, 1, 2, \dots\}$ and every $w \in W$ we have $g(f^j(f(w))) = g(f^{j+1}(w)) = 0$, the subspace $W$ is $f$-invariant.
Now we prove that the sum $\operatorname{Span}\{v, \dots, f^{k-1}(v)\} + W$ is direct. Suppose

\[
w = x_1 v + x_2 f(v) + \dots + x_k f^{k-1}(v) \in W
\]

for some $x_1, \dots, x_k \in K$. Then $g(w) = x_k = 0$ and thus $g(f(w)) = x_{k-1} = 0$, because

\[
f(w) = x_1 f(v) + \dots + x_{k-1} f^{k-1}(v) + x_k f^k(v) = x_1 f(v) + \dots + x_{k-1} f^{k-1}(v).
\]

Continuing this way we show that $x_k = x_{k-1} = x_{k-2} = \dots = x_1 = 0$, so the sum is direct.
Now, since $m_f(f) = m_{f,v}(f) = 0$, we have

\[
f^k = -a_0 - a_1 f - \dots - a_{k-1} f^{k-1}
\]

and then, for every integer $l \ge 0$,

\[
f^{k+l} = -a_0 f^l - a_1 f^{l+1} - \dots - a_{k-1} f^{k+l-1},
\]

which yields

\[
\operatorname{Span}\{f^j, \; j \in \{0, 1, 2, \dots\}\} = \operatorname{Span}\{f^j, \; j \in \{0, 1, \dots, k-1\}\}.
\]

Consequently

\[
\begin{aligned}
W &= \{w \in V : g(f^j(w)) = 0, \; j \in \{0, \dots, k-1\}\} \\
  &= \{w \in V : (g f^j)(w) = 0, \; j \in \{0, \dots, k-1\}\}.
\end{aligned}
\]

Next we show that the functionals $g, gf, \dots, gf^{k-1}$ are linearly independent. Suppose

\[
x_1 g + x_2 gf + \dots + x_k g f^{k-1} = 0
\]

for some $x_1, \dots, x_k \in K$. Applying this equality successively to $v, f(v), f^2(v), \dots, f^{k-1}(v)$ we get

\[
\begin{aligned}
g(x_1 v + \dots + x_k f^{k-1}(v)) &= 0, \\
g(x_1 f(v) + \dots + x_k f^k(v)) &= 0, \\
&\;\;\vdots \\
g(x_1 f^{k-1}(v) + \dots + x_k f^{2k-2}(v)) &= 0,
\end{aligned}
\]

which can be written as

\[
\begin{aligned}
g(x_1 v + \dots + x_k f^{k-1}(v)) &= 0, \\
g(f(x_1 v + \dots + x_k f^{k-1}(v))) &= 0, \\
&\;\;\vdots \\
g(f^{k-1}(x_1 v + \dots + x_k f^{k-1}(v))) &= 0.
\end{aligned}
\]

This shows that $x_1 v + \dots + x_k f^{k-1}(v) \in W$ and thus $x_1 = \dots = x_k = 0$, because the sum $\operatorname{Span}\{v, \dots, f^{k-1}(v)\} + W$ is direct. Consequently the functionals $g, gf, \dots, gf^{k-1}$ are linearly independent. By Theorem 2.4.14, we get $\dim W = \dim V - k$, which completes the proof. □




Theorem 4.2.14. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $m_f(t) = (t - a)^n$ for some integer $n \ge 1$ and $a \in K$, then there are nonzero vectors $v_1, \dots, v_r \in V$ such that

\[
V = V_{f,v_1} \oplus \dots \oplus V_{f,v_r}
\]

and

\[
m_{f,v_1} = (t - a)^{k_1}, \; \dots, \; m_{f,v_r} = (t - a)^{k_r},
\]

where $k_1, \dots, k_r$ are integers such that $1 \le k_r \le \dots \le k_1 = n$.


Proof. Let $v_1 \in V$ be such that $(f - a)^{n-1}(v_1) \neq 0$. Then $m_{f,v_1} = (t - a)^n$. If $V = V_{f,v_1}$ and we take $k_1 = n$, then we are done. If $V \neq V_{f,v_1}$, then there is an $f$-invariant subspace $W$ of dimension $\dim V - n$ such that $V = V_{f,v_1} \oplus W$, by Lemma 4.2.13. Let $g : W \to W$ be the endomorphism induced by $f$ on $W$. Clearly, $m_f(t) = (t - a)^n$ is a $g$-annihilator. Consequently, $m_g(t)$ divides $m_f(t) = (t - a)^n$. This means that $m_g(t) = (t - a)^m$ where $m \le n$. Now, since $\dim W = \dim V - n < \dim V$, we can finish the proof using induction. □


From the above theorem and Theorem 4.2.7 we obtain the following impor- 
tant result. 


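Corollary 4.2.15. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $m_f(t) = (t - a)^n$ for some integer $n \ge 1$ and $a \in K$, then there are integers $1 \le k_r \le \dots \le k_1 = n$ and nonzero vectors $v_1, \dots, v_r$ such that

\[
V = \bigoplus_{j=1}^{r} \operatorname{Span}\{(f - a)^{k_j - 1} v_j, (f - a)^{k_j - 2} v_j, \dots, (f - a) v_j, v_j\},
\]

where for every $j \in \{1, \dots, r\}$ we have $(f - a)^{k_j} v_j = 0$ and $(f - a)^{k_j - 1} v_j \neq 0$. Moreover, the set

\[
\mathcal{B} = \bigcup_{j=1}^{r} \{(f - a)^{k_j - 1} v_j, (f - a)^{k_j - 2} v_j, \dots, (f - a) v_j, v_j\}
\]

is a basis of $V$ and the $\mathcal{B}$-matrix of $f$ is

\[
\begin{bmatrix}
J_{a,k_1} & & & 0 \\
 & J_{a,k_2} & & \\
 & & \ddots & \\
0 & & & J_{a,k_r}
\end{bmatrix}.
\]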




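Definition 4.2.16. Let $a \in K$. We say that a matrix is in $a$-Jordan canonical form if it has the form

\[
\begin{bmatrix}
J_{a,k_1} & & & 0 \\
 & J_{a,k_2} & & \\
 & & \ddots & \\
0 & & & J_{a,k_r}
\end{bmatrix},
\]

where $k_1, \dots, k_r$ are integers such that $k_1 \ge \dots \ge k_r \ge 1$ and $J_{a,k_1}, J_{a,k_2}, \dots, J_{a,k_r}$ are Jordan blocks.


If it is clear from the context what $a$ is, then instead of "$a$-Jordan canonical form" we simply say "Jordan canonical form".


Here is an example of a matrix in $a$-Jordan canonical form:

\[
\left[\begin{array}{ccc|ccc|cc|c}
a & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & a & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & a & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & a & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & a & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & a & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & a & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & a & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & a
\end{array}\right]
\]

In this example $k_1 = k_2 = 3$, $k_3 = 2$, $k_4 = 1$ and

\[
J_{a,k_1} = J_{a,k_2} = \begin{bmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{bmatrix}, \quad
J_{a,k_3} = \begin{bmatrix} a & 1 \\ 0 & a \end{bmatrix}, \quad
J_{a,k_4} = \begin{bmatrix} a \end{bmatrix}.
\]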

The result in the following theorem is useful when computing Jordan canon- 
ical forms. 




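Theorem 4.2.17. Let $V$ be a vector space, $f : V \to V$ an endomorphism, and $a \in K$. If, for some integer $k \ge 2$, $v_1, \dots, v_m \in \ker(f - a)^k$ are linearly independent vectors such that

\[
\ker(f - a)^k = \ker(f - a)^{k-1} \oplus K v_1 \oplus \dots \oplus K v_m,
\]

then the vectors $(f - a)v_1, \dots, (f - a)v_m$ are linearly independent vectors in $\ker(f - a)^{k-1}$ and the sum

\[
\ker(f - a)^{k-2} + K (f - a) v_1 + \dots + K (f - a) v_m
\]

is direct.


Proof. If $w \in \ker(f - a)^{k-2}$ and

\[
w + x_1 (f - a) v_1 + \dots + x_m (f - a) v_m = 0
\]

for some $x_1, \dots, x_m \in K$, then

\[
\begin{aligned}
0 &= (f - a)^{k-2} w + x_1 (f - a)^{k-1} v_1 + \dots + x_m (f - a)^{k-1} v_m \\
  &= x_1 (f - a)^{k-1} v_1 + \dots + x_m (f - a)^{k-1} v_m \\
  &= (f - a)^{k-1} (x_1 v_1 + \dots + x_m v_m)
\end{aligned}
\]

and thus

\[
x_1 v_1 + \dots + x_m v_m \in \ker(f - a)^{k-1}.
\]

Since $\ker(f - a)^k = \ker(f - a)^{k-1} \oplus K v_1 \oplus \dots \oplus K v_m$, we must have $x_1 v_1 + \dots + x_m v_m = 0$. Thus $x_1 = \dots = x_m = 0$, because the vectors $v_1, \dots, v_m$ are linearly independent, and consequently $w = 0$. □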


Example 4.2.18. Let $V$ be a vector space and let $f : V \to V$ be an endomorphism such that $c_f(t) = (a - t)^9$ and $m_f(t) = (t - a)^3$ for some $a \in K$. If

\[
\dim \ker(f - a) = 4 \quad \text{and} \quad \dim \ker(f - a)^2 = 7,
\]

show that there is a basis $\mathcal{B}$ of $V$ such that the $\mathcal{B}$-matrix of $f$ is

\[
A = \left[\begin{array}{ccc|ccc|cc|c}
a & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & a & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & a & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & a & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & a & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & a & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & a & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & a & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & a
\end{array}\right].
\]


Solution. Let $u$ and $v$ be linearly independent vectors in $V$ such that

\[
\ker(f - a)^2 \oplus K u \oplus K v = \ker(f - a)^3 = V.
\]

According to Theorem 4.2.17 the sum

\[
\ker(f - a) + K (f - a) u + K (f - a) v
\]

is direct. Note that

\[
\ker(f - a) + K (f - a) u + K (f - a) v \subseteq \ker(f - a)^2.
\]

Because $\dim \ker(f - a) = 4$ and $\dim \ker(f - a)^2 = 7$, there is a $p \in \ker(f - a)^2$ such that

\[
\ker(f - a) \oplus K (f - a) u \oplus K (f - a) v \oplus K p = \ker(f - a)^2.
\]

The vectors

\[
(f - a)^2 u, \; (f - a)^2 v, \; (f - a) p
\]

are in $\ker(f - a)$, that is, they are eigenvectors of $f$, and they are linearly independent, by Theorem 4.2.17 (with $k = 2$).
Because $\dim \ker(f - a) = 4$ we can find an eigenvector $q$ such that

\[
K (f - a)^2 u \oplus K (f - a)^2 v \oplus K (f - a) p \oplus K q = \ker(f - a).
\]

Consequently,

\[
V = K (f - a)^2 u \oplus K (f - a) u \oplus K u \oplus K (f - a)^2 v \oplus K (f - a) v \oplus K v \oplus K (f - a) p \oplus K p \oplus K q
\]

and

\[
\mathcal{B} = \{(f - a)^2 u, (f - a) u, u, (f - a)^2 v, (f - a) v, v, (f - a) p, p, q\}
\]

is a basis of $V$. It is easy to verify that $A$ is the $\mathcal{B}$-matrix of $f$. □


Example 4.2.19. Let $V$ be a vector space and let $f : V \to V$ be an endomorphism. If $m_f(t) = t^3$ and there are positive integers $q, r, s$ such that $\dim V = 3q + 2r + s$, $\dim \ker f^2 = 2q + 2r + s$, and $\dim \ker f = q + r + s$, show that there are linearly independent vectors $v_1, \dots, v_q, w_1, \dots, w_r, u_1, \dots, u_s$ such that $V$ is a direct sum of the following $q + r + s$ cyclic subspaces:

\[
\begin{gathered}
\operatorname{Span}\{f^2(v_1), f(v_1), v_1\}, \dots, \operatorname{Span}\{f^2(v_q), f(v_q), v_q\}, \\
\operatorname{Span}\{f(w_1), w_1\}, \dots, \operatorname{Span}\{f(w_r), w_r\}, \\
\operatorname{Span}\{u_1\}, \dots, \operatorname{Span}\{u_s\}.
\end{gathered}
\]


Solution. There are linearly independent vectors $v_1, \dots, v_q$ such that

\[
V = \ker f^3 = \ker f^2 \oplus K v_1 \oplus \dots \oplus K v_q.
\]

By Theorem 4.2.17, the sum $\ker f + K f(v_1) + \dots + K f(v_q)$ is direct and it is a subspace of $\ker f^2$. Let $w_1, \dots, w_r \in \ker f^2$ be linearly independent vectors such that

\[
\ker f^2 = \ker f \oplus K f(v_1) \oplus \dots \oplus K f(v_q) \oplus K w_1 \oplus \dots \oplus K w_r.
\]

Again by Theorem 4.2.17, the vectors $f^2(v_1), \dots, f^2(v_q), f(w_1), \dots, f(w_r)$ are linearly independent vectors in $\ker f$. Now, let $u_1, \dots, u_s$ be linearly independent vectors such that

\[
\ker f = K f^2(v_1) \oplus \dots \oplus K f^2(v_q) \oplus K f(w_1) \oplus \dots \oplus K f(w_r) \oplus K u_1 \oplus \dots \oplus K u_s.
\]

Consequently, the vector space $V$ is a direct sum of the following $q + r + s$ cyclic subspaces:

\[
\begin{gathered}
\operatorname{Span}\{f^2(v_1), f(v_1), v_1\}, \dots, \operatorname{Span}\{f^2(v_q), f(v_q), v_q\}, \\
\operatorname{Span}\{f(w_1), w_1\}, \dots, \operatorname{Span}\{f(w_r), w_r\}, \\
\operatorname{Span}\{u_1\}, \dots, \operatorname{Span}\{u_s\}.
\end{gathered}
\]

□


It is worth noting that Theorem 4.2.14 and Corollary 4.2.15 can be obtained from Theorem 4.2.17. In the next theorem we give a slightly different formulation of these results and use Theorem 4.2.17 to prove it.


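Theorem 4.2.20. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $m_f(t) = (t - a)^k$ for some $a \in K$ and integer $k \ge 1$, then $V$ is a direct sum of subspaces of the form

\[
\operatorname{Span}\{(f - a)^{m-1} v, \dots, (f - a) v, v\}
\]

for some integer $m \ge 1$ and $v \in V$ such that $m_{f,v} = (t - a)^m$.


Proof. Recall that $m_f(t) = (t - a)^k$ means that $V = \ker(f - a)^k$ and $V \neq \ker(f - a)^{k-1}$. Similarly, $m_{f,v} = (t - a)^m$ means that $(f - a)^m v = 0$ and $(f - a)^{m-1} v \neq 0$.
Let

\[
\mathcal{B}_k = \{v_{k,1}, \dots, v_{k,m_k}\}
\]

be a basis of a complement $C_k$ of $\ker(f - a)^{k-1}$ in $\ker(f - a)^k = V$, that is,

\[
V = \ker(f - a)^k = \ker(f - a)^{k-1} \oplus C_k.
\]

By Theorem 4.2.17, there are vectors $v_{k-1,1}, \dots, v_{k-1,m_{k-1}} \in \ker(f - a)^{k-1}$ such that the set $\mathcal{B}_{k-1}$ consisting of vectors

\[
\begin{gathered}
(f - a) v_{k,1}, \dots, (f - a) v_{k,m_k}, \\
v_{k-1,1}, \dots, v_{k-1,m_{k-1}}
\end{gathered}
\]

is a basis of a complement $C_{k-1}$ of $\ker(f - a)^{k-2}$ in $\ker(f - a)^{k-1}$. Consequently,

\[
V = \ker(f - a)^{k-2} \oplus C_{k-1} \oplus C_k.
\]

Next, there are vectors $v_{k-2,1}, \dots, v_{k-2,m_{k-2}} \in \ker(f - a)^{k-2}$ such that the set $\mathcal{B}_{k-2}$ consisting of vectors

\[
\begin{gathered}
(f - a)^2 v_{k,1}, \dots, (f - a)^2 v_{k,m_k}, \\
(f - a) v_{k-1,1}, \dots, (f - a) v_{k-1,m_{k-1}}, \\
v_{k-2,1}, \dots, v_{k-2,m_{k-2}}
\end{gathered}
\]

is a basis of a complement $C_{k-2}$ of $\ker(f - a)^{k-3}$ in $\ker(f - a)^{k-2}$, again by Theorem 4.2.17, and we have

\[
V = \ker(f - a)^{k-3} \oplus C_{k-2} \oplus C_{k-1} \oplus C_k.
\]

Continuing as above we eventually obtain vectors $v_{2,1}, \dots, v_{2,m_2} \in \ker(f - a)^2$ such that the set $\mathcal{B}_2$ consisting of vectors

\[
\begin{gathered}
(f - a)^{k-2} v_{k,1}, \dots, (f - a)^{k-2} v_{k,m_k}, \\
(f - a)^{k-3} v_{k-1,1}, \dots, (f - a)^{k-3} v_{k-1,m_{k-1}}, \\
\vdots \\
(f - a) v_{3,1}, \dots, (f - a) v_{3,m_3}, \\
v_{2,1}, \dots, v_{2,m_2}
\end{gathered}
\]

is a basis of a complement $C_2$ of $\ker(f - a)$ in $\ker(f - a)^2$.

Finally, there are eigenvectors $v_{1,1}, \dots, v_{1,m_1}$ of $f$ such that the set $\mathcal{B}_1$ consisting of vectors

\[
\begin{gathered}
(f - a)^{k-1} v_{k,1}, \dots, (f - a)^{k-1} v_{k,m_k}, \\
\vdots \\
(f - a) v_{2,1}, \dots, (f - a) v_{2,m_2}, \\
v_{1,1}, \dots, v_{1,m_1}
\end{gathered}
\]

is a basis of the eigenspace $\ker(f - a)$.

Now, since

\[
V = \ker(f - a) \oplus C_2 \oplus \dots \oplus C_{k-1} \oplus C_k,
\]

the set of the vectors

\[
\mathcal{B}_k \cup \mathcal{B}_{k-1} \cup \dots \cup \mathcal{B}_2 \cup \mathcal{B}_1
\]

is a basis of $V$. This basis contains $m_k$ sets of vectors

\[
\begin{gathered}
\{(f - a)^{k-1} v_{k,1}, (f - a)^{k-2} v_{k,1}, \dots, (f - a) v_{k,1}, v_{k,1}\}, \\
\vdots \\
\{(f - a)^{k-1} v_{k,m_k}, (f - a)^{k-2} v_{k,m_k}, \dots, (f - a) v_{k,m_k}, v_{k,m_k}\},
\end{gathered}
\]

each with $k$ elements, which will generate $m_k$ Jordan blocks $J_{a,k}$, $m_{k-1}$ sets of vectors

\[
\begin{gathered}
\{(f - a)^{k-2} v_{k-1,1}, (f - a)^{k-3} v_{k-1,1}, \dots, (f - a) v_{k-1,1}, v_{k-1,1}\}, \\
\vdots \\
\{(f - a)^{k-2} v_{k-1,m_{k-1}}, (f - a)^{k-3} v_{k-1,m_{k-1}}, \dots, (f - a) v_{k-1,m_{k-1}}, v_{k-1,m_{k-1}}\},
\end{gathered}
\]

each with $k - 1$ elements, which will generate $m_{k-1}$ Jordan blocks $J_{a,k-1}$, and so on, down to $m_2$ sets of vectors

\[
\begin{gathered}
\{(f - a) v_{2,1}, v_{2,1}\}, \\
\vdots \\
\{(f - a) v_{2,m_2}, v_{2,m_2}\},
\end{gathered}
\]

each with 2 elements, which will generate $m_2$ Jordan blocks $J_{a,2}$, and $m_1$ eigenvectors of $f$

\[
v_{1,1}, \dots, v_{1,m_1},
\]

which will generate $m_1$ Jordan blocks $J_{a,1}$. □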


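Example 4.2.21. Let $V$ be a vector space and let $f : V \to V$ be an endomorphism. If $c_f(t) = (a - t)^5$ and $m_f(t) = (t - a)^2$ for some $a \in K$, determine all possible Jordan canonical forms associated with such an endomorphism $f$.


Solution. It is easy to see that these forms are

\[
\left[\begin{array}{cc|cc|c}
a & 1 & 0 & 0 & 0 \\
0 & a & 0 & 0 & 0 \\ \hline
0 & 0 & a & 1 & 0 \\
0 & 0 & 0 & a & 0 \\ \hline
0 & 0 & 0 & 0 & a
\end{array}\right]
\quad \text{and} \quad
\left[\begin{array}{cc|c|c|c}
a & 1 & 0 & 0 & 0 \\
0 & a & 0 & 0 & 0 \\ \hline
0 & 0 & a & 0 & 0 \\ \hline
0 & 0 & 0 & a & 0 \\ \hline
0 & 0 & 0 & 0 & a
\end{array}\right].
\]

In the first case we have $\dim \ker(f - a) = 3$ and we have three Jordan blocks and in the second case we have $\dim \ker(f - a) = 4$ and there are four Jordan blocks. □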
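The enumeration in Example 4.2.21 amounts to listing the ways of writing $\dim V$ as a sum of block sizes whose largest part equals the degree of the minimal polynomial. A minimal sketch of that enumeration follows; the function names are hypothetical and the code only lists block sizes, not the matrices themselves.

```python
def partitions(total, largest):
    """Yield weakly decreasing lists of positive integers <= largest that sum to total."""
    if total == 0:
        yield []
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, first):
            yield [first] + rest

def jordan_block_sizes(dim, min_poly_degree):
    """Block-size lists for c_f = (a - t)^dim and m_f = (t - a)^min_poly_degree.

    The largest block must have size exactly min_poly_degree.
    """
    return [p for p in partitions(dim, min_poly_degree) if p[0] == min_poly_degree]

for sizes in jordan_block_sizes(5, 2):
    # The number of blocks equals dim ker(f - a).
    print(sizes, "blocks:", len(sizes))
```

For dimension 5 and $m_f(t) = (t-a)^2$ this lists the block sizes 2, 2, 1 and 2, 1, 1, 1, which are the two forms found in the example.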


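Example 4.2.22. Let $V$ be a vector space such that $\dim V = 9$ and let $f : V \to V$ be an endomorphism. If

\[
\dim \ker(f - a) = 3, \quad \dim \ker(f - a)^2 = 5, \quad \dim \ker(f - a)^3 = 7, \quad \dim \ker(f - a)^4 = 8,
\]

and $m_f(t) = (t - a)^5$, determine the Jordan canonical form of $f$.


Solution. Following the proof of Theorem 4.2.20 or Theorem 4.2.17 it is easy to verify that the Jordan canonical form of $f$ is

\[
\left[\begin{array}{ccccc|ccc|c}
a & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & a & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & a & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & a & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & a & 0 & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & a & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & a & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & a & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & a
\end{array}\right].
\]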


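Example 4.2.23. We consider the endomorphism $f : \mathbb{R}^5 \to \mathbb{R}^5$ defined by $f(x) = Ax$ where

\[
A = \begin{bmatrix}
-7 & 2 & -14 & -24 & -4 \\
-4 & 3 & -7 & -16 & -3 \\
7 & -2 & 13 & 22 & 4 \\
-3 & 1 & -5 & -9 & -2 \\
7 & -2 & 12 & 22 & 5
\end{bmatrix}.
\]

Knowing that $c_f(t) = (1 - t)^5$, determine $m_f$ and a basis $\mathcal{B}$ of $\mathbb{R}^5$ such that the $\mathcal{B}$-matrix of $f$ is in Jordan canonical form.


Solution. Since

\[
B = A - \begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix} = \begin{bmatrix}
-8 & 2 & -14 & -24 & -4 \\
-4 & 2 & -7 & -16 & -3 \\
7 & -2 & 12 & 22 & 4 \\
-3 & 1 & -5 & -10 & -2 \\
7 & -2 & 12 & 22 & 4
\end{bmatrix},
\]

\[
B^2 = \begin{bmatrix}
2 & 0 & 2 & 4 & 2 \\
2 & 0 & 2 & 4 & 2 \\
-2 & 0 & -2 & -4 & -2 \\
1 & 0 & 1 & 2 & 1 \\
-2 & 0 & -2 & -4 & -2
\end{bmatrix},
\]

and

\[
B^3 = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix},
\]

we have $m_f(t) = (t - 1)^3$.
The set

\[
\left\{
\begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix},
\begin{bmatrix} -2 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix},
\begin{bmatrix} -1 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix},
\begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\right\}
\]

is a basis of $\ker(f - 1)^2$ and the set

\[
\left\{
\begin{bmatrix} -2 \\ 1 \\ 1 \\ 0 \\ 1 \end{bmatrix},
\begin{bmatrix} -2 \\ 4 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\right\}
\]

is a basis of $\ker(f - 1)$.

It is easy to see that $\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$ is not in $\ker(f - 1)^2$ and that $\begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \in \ker(f - 1)^2$ is not in

\[
\operatorname{Span}\left\{ (f - 1) \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -2 \\ 1 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} -2 \\ 4 \\ 0 \\ 1 \\ 0 \end{bmatrix} \right\}
= \operatorname{Span}\left\{ \begin{bmatrix} -8 \\ -4 \\ 7 \\ -3 \\ 7 \end{bmatrix}, \begin{bmatrix} -2 \\ 1 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} -2 \\ 4 \\ 0 \\ 1 \\ 0 \end{bmatrix} \right\}.
\]

Consequently the set

\[
\mathcal{B} = \left\{ (f - 1)^2 \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, (f - 1) \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, (f - 1) \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \right\}
= \left\{ \begin{bmatrix} 2 \\ 2 \\ -2 \\ 1 \\ -2 \end{bmatrix}, \begin{bmatrix} -8 \\ -4 \\ 7 \\ -3 \\ 7 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -6 \\ -3 \\ 5 \\ -2 \\ 5 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \right\}
\]

is a basis of $\mathbb{R}^5$ with the desired properties, and the $\mathcal{B}$-matrix of $f$ is

\[
\left[\begin{array}{ccc|cc}
1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\ \hline
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1
\end{array}\right].
\]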
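The numbers in Example 4.2.23 are easy to get wrong by hand, so a mechanical check is useful. The sketch below assumes the $5 \times 5$ matrix $A$ with rows (-7, 2, -14, -24, -4), (-4, 3, -7, -16, -3), (7, -2, 13, 22, 4), (-3, 1, -5, -9, -2), (7, -2, 12, 22, 5) and the two chain generators $(1,0,0,0,0)$ and $(-1,0,1,0,0)$; the helper names are ours. It checks that $(A - I)^3 = 0$ while $(A - I)^2 \neq 0$, and that $AP = PJ$, where the columns of $P$ are the basis vectors and $J$ consists of the Jordan blocks $J_{1,3}$ and $J_{1,2}$. It does not by itself prove that $P$ is invertible.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def column(vector):
    """Turn a flat list into a column matrix."""
    return [[c] for c in vector]

A = [[-7, 2, -14, -24, -4],
     [-4, 3, -7, -16, -3],
     [7, -2, 13, 22, 4],
     [-3, 1, -5, -9, -2],
     [7, -2, 12, 22, 5]]
B = [[A[i][j] - (1 if i == j else 0) for j in range(5)] for i in range(5)]  # f - 1
B2 = matmul(B, B)
B3 = matmul(B2, B)

e1 = column([1, 0, 0, 0, 0])   # generator of the chain of length 3
w = column([-1, 0, 1, 0, 0])   # generator of the chain of length 2
chain = [matmul(B2, e1), matmul(B, e1), e1, matmul(B, w), w]

# P has the basis vectors as columns, in the order listed in the example.
P = [[vec[i][0] for vec in chain] for i in range(5)]
J = [[1, 1, 0, 0, 0],
     [0, 1, 1, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 0, 0, 0, 1]]

print(all(x == 0 for row in B3 for x in row), any(x != 0 for row in B2 for x in row))
print(matmul(A, P) == matmul(P, J))
```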


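Example 4.2.24. Let $f : V \to V$ be an endomorphism on a vector space $V$ and let $a \in K$. If $\mathcal{B} = \{v_1, v_2, v_3, v_4, v_5, v_6, v_7, v_8, v_9\}$ is a basis of $V$ such that the $\mathcal{B}$-matrix of $f$ is

\[
\left[\begin{array}{cccc|ccc|c|c}
a & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & a & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & a & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & a & 0 & 0 & 0 & 0 & 0 \\ \hline
0 & 0 & 0 & 0 & a & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & a & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & a & 0 & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & a & 0 \\ \hline
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & a
\end{array}\right],
\]

find bases of

\[
\ker(f - a), \quad \ker(f - a)^2 \quad \text{and} \quad \ker(f - a)^3
\]

and the polynomial $m_f$.


Solution. Since

\[
\begin{aligned}
f(v_1) &= a v_1, \\
f(v_2) &= v_1 + a v_2, \\
f(v_3) &= v_2 + a v_3, \\
f(v_4) &= v_3 + a v_4, \\
f(v_5) &= a v_5, \\
f(v_6) &= v_5 + a v_6, \\
f(v_7) &= v_6 + a v_7, \\
f(v_8) &= a v_8, \\
f(v_9) &= a v_9,
\end{aligned}
\]

$\{v_1, v_5, v_8, v_9\}$ is a basis of $\ker(f - a)$,

$\{v_1, v_2, v_5, v_6, v_8, v_9\}$ is a basis of $\ker(f - a)^2$,

$\{v_1, v_2, v_3, v_5, v_6, v_7, v_8, v_9\}$ is a basis of $\ker(f - a)^3$,
and

\[
m_f(t) = (t - a)^4. \qquad \square
\]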
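The dimensions read off in Example 4.2.24 follow a simple pattern: a Jordan block of size $n$ contributes $\min(m, n)$ vectors to a basis of $\ker(f - a)^m$. The following sketch computes the kernel dimensions from a list of block sizes; the function name is hypothetical.

```python
def kernel_dimensions(block_sizes):
    """Return [dim ker (f - a)^m for m = 1, ..., largest block size].

    Each Jordan block of size n contributes min(m, n) to dim ker (f - a)^m.
    """
    return [sum(min(m, n) for n in block_sizes)
            for m in range(1, max(block_sizes) + 1)]

# Block sizes 4, 3, 1, 1 as in the matrix of the example.
print(kernel_dimensions([4, 3, 1, 1]))
```

The length of the returned list is the size of the largest block, that is, the exponent in $m_f(t) = (t - a)^4$, and its last entry is $\dim V$.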


4.2.2 Uniqueness of the Jordan canonical form when the 
characteristic polynomial has one root 


In this section we present a formula for the number of Jordan blocks in the 
Jordan canonical form of an endomorphism with the characteristic polynomial 
that has only one root. This result gives us a uniqueness theorem for the Jordan 
canonical form for such endomorphisms. First we need some preliminary results. 


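Lemma 4.2.25. Let $U$ be a finite dimensional vector space and let $g : U \to U$ be an endomorphism. If the Jordan canonical form of $g$ is a Jordan $k \times k$ block $J_{a,k}$ for some $a \in K$, then $\dim \operatorname{ran}(g - a)^m = k - m$ for $1 \le m \le k - 1$ and $\dim \operatorname{ran}(g - a)^m = 0$ for $m \ge k$.


Proof. Let $\mathcal{B} = \{v_1, \dots, v_k\}$ be a basis of $U$ such that the $\mathcal{B}$-matrix of $g$ is the Jordan $k \times k$ block $J_{a,k}$. Then

\[
(g - a)(v_1) = 0, \quad (g - a)(v_2) = v_1, \quad \dots, \quad (g - a)(v_k) = v_{k-1}.
\]

This means that $\operatorname{ran}(g - a) = \operatorname{Span}\{v_1, \dots, v_{k-1}\}$. Since

\[
(g - a)^2(v_1) = 0, \quad (g - a)^2(v_2) = 0, \quad (g - a)^2(v_3) = v_1, \quad \dots, \quad (g - a)^2(v_k) = v_{k-2},
\]

we have $\operatorname{ran}(g - a)^2 = \operatorname{Span}\{v_1, \dots, v_{k-2}\}$.

Continuing the same way we get $\operatorname{ran}(g - a)^m = \operatorname{Span}\{v_1, \dots, v_{k-m}\}$ for $m \le k - 1$ and $\operatorname{ran}(g - a)^m = \{0\}$ for $m \ge k$. This gives us $\dim \operatorname{ran}(g - a)^m = k - m$ for $1 \le m \le k - 1$ and $\dim \operatorname{ran}(g - a)^m = 0$ for $m \ge k$. □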
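Lemma 4.2.25 can be illustrated numerically on the nilpotent matrix $J_{a,k} - aI$, which does not depend on $a$. The sketch below is an illustration, not part of the proof: it builds that matrix, raises it to successive powers, and computes ranks by Gaussian elimination over the rationals. All function names are hypothetical.

```python
from fractions import Fraction

def nilpotent_block(k):
    """The k x k matrix J_{a,k} - a*I: ones on the superdiagonal, zeros elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(k)] for i in range(k)]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank(M):
    """Rank of a matrix, computed by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def ranks_of_powers(k, max_power):
    """Ranks of (J_{a,k} - a)^m for m = 1, ..., max_power."""
    N = nilpotent_block(k)
    power, result = N, []
    for _ in range(max_power):
        result.append(rank(power))
        power = matmul(power, N)
    return result

print(ranks_of_powers(4, 5))
```

For $k = 4$ the ranks decrease by one at each step until they reach 0 at $m = 4$, as the lemma states.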


Lemma 4.2.26. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If the diagonal Jordan blocks of the Jordan canonical form of $f$ are $J_{a,n_1}, \dots, J_{a,n_p}$, for some positive integers $n_1, \dots, n_p$, then

\[
\dim \operatorname{ran}(f - a)^{k-1} = \{\text{number of Jordan blocks } J_{a,k}\} + \sum_{n_q > k} (n_q - k + 1),
\]

for every integer $k \ge 1$.


Proof. We have

\[
V = U_1 \oplus \dots \oplus U_p,
\]

where, for every $q \in \{1, \dots, p\}$, $U_q$ is a vector subspace of $V$ with a basis $\mathcal{B}_q$ such that the $\mathcal{B}_q$-matrix of the endomorphism $f_q : U_q \to U_q$ induced by $f$ on $U_q$ is $J_{a,n_q}$. For any vectors $v \in V$, $u_1 \in U_1, \dots, u_p \in U_p$ such that $v = u_1 + \dots + u_p$, we have

\[
f^m(v) = f_1^m(u_1) + \dots + f_p^m(u_p),
\]

for every integer $m \ge 1$. By Lemma 4.2.25,

\[
\begin{aligned}
\dim \operatorname{ran}(f_q - a)^{k-1} &= 0 && \text{if } n_q < k, \\
\dim \operatorname{ran}(f_q - a)^{k-1} &= 1 && \text{if } n_q = k, \\
\dim \operatorname{ran}(f_q - a)^{k-1} &= n_q - (k - 1) = n_q - k + 1 && \text{if } n_q > k.
\end{aligned}
\]

The desired result is a consequence of these equalities. □


Lemma 4.2.27. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If the diagonal Jordan blocks of the Jordan canonical form of $f$ are $J_{a,n_1}, \dots, J_{a,n_p}$, for some positive integers $n_1, \dots, n_p$, then

\[
\{\text{number of Jordan blocks } J_{a,k}\}
= \dim \operatorname{ran}(f - a)^{k-1} + \dim \operatorname{ran}(f - a)^{k+1} - 2 \dim \operatorname{ran}(f - a)^k,
\]

for every integer $k \ge 1$.


Proof. By Lemma 4.2.26 we have

\[
\dim \operatorname{ran}(f - a)^{k-1} = \{\text{number of Jordan blocks } J_{a,k}\} + \sum_{n_q > k} (n_q - k + 1)
\]

and by Lemma 4.2.25 we have

\[
\dim \operatorname{ran}(f - a)^k = \sum_{n_q \ge k+1} (n_q - k) = \sum_{n_q > k} (n_q - k)
\]

and

\[
\dim \operatorname{ran}(f - a)^{k+1} = \sum_{n_q \ge k+2} (n_q - k - 1) = \sum_{n_q \ge k+1} (n_q - k - 1) = \sum_{n_q > k} (n_q - k - 1).
\]

Consequently

\[
\begin{aligned}
&\dim \operatorname{ran}(f - a)^{k-1} + \dim \operatorname{ran}(f - a)^{k+1} - 2 \dim \operatorname{ran}(f - a)^k \\
&\quad = \{\text{number of Jordan blocks } J_{a,k}\} + \sum_{n_q > k} (n_q - k + 1) + \sum_{n_q > k} (n_q - k - 1) - 2 \sum_{n_q > k} (n_q - k) \\
&\quad = \{\text{number of Jordan blocks } J_{a,k}\}.
\end{aligned}
\]

□


Theorem 4.2.28. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If the diagonal Jordan blocks of the Jordan canonical form of $f$ are $J_{a,n_1}, \dots, J_{a,n_p}$, for some positive integers $n_1, \dots, n_p$, then

\[
\{\text{number of Jordan blocks } J_{a,k}\}
= 2 \dim \ker(f - a)^k - \dim \ker(f - a)^{k-1} - \dim \ker(f - a)^{k+1},
\]

for every integer $k \ge 1$.


Proof. This equality is an immediate consequence of Lemma 4.2.27 and the Rank-Nullity Theorem 2.1.28. □
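The formula of Theorem 4.2.28 translates directly into a short computation. The sketch below assumes the input is the list of kernel dimensions $\dim \ker(f - a)^j$ for $j = 1, \dots, n$, where the last entry equals $\dim V$ (so the dimensions stay constant afterwards); the function name and the dictionary output format are our own choices.

```python
def jordan_block_counts(kernel_dims):
    """Number of Jordan blocks of each size, from dim ker (f - a)^j, j = 1..n.

    kernel_dims[j - 1] must be dim ker (f - a)^j and kernel_dims[-1] = dim V.
    Returns a dict {block size k: number of blocks J_{a,k}}, omitting zero counts.
    """
    # d[0] = dim ker (f - a)^0 = 0 and d[n + 1] = d[n] = dim V.
    d = [0] + list(kernel_dims) + [kernel_dims[-1]]
    counts = {}
    for k in range(1, len(kernel_dims) + 1):
        count = 2 * d[k] - d[k - 1] - d[k + 1]
        if count:
            counts[k] = count
    return counts

# Data of Example 4.2.22: dimensions 3, 5, 7, 8 and dim V = 9.
print(jordan_block_counts([3, 5, 7, 8, 9]))
# Data of Example 4.2.18: dimensions 4, 7 and dim V = 9.
print(jordan_block_counts([4, 7, 9]))
```

For the data of Example 4.2.22 this gives one block each of sizes 1, 3 and 5; for Example 4.2.18 it gives two blocks of size 3, one of size 2 and one of size 1.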


Corollary 4.2.29. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If the minimal polynomial of $f$ is $m_f = (t - a)^n$ for some integer $n \ge 1$, then the number of Jordan blocks in the Jordan canonical form of $f$ is $\dim \ker(f - a)$.


Proof. According to Theorem 4.2.28 the number of $n \times n$ Jordan blocks $J_{a,n}$ is

\[
\begin{aligned}
&2 \dim \ker(f - a)^n - \dim \ker(f - a)^{n+1} - \dim \ker(f - a)^{n-1} \\
&\quad = \dim \ker(f - a)^n - \dim \ker(f - a)^{n-1},
\end{aligned}
\]

the number of $m \times m$ Jordan blocks $J_{a,m}$, where $m \in \{2, \dots, n-1\}$, is

\[
2 \dim \ker(f - a)^m - \dim \ker(f - a)^{m+1} - \dim \ker(f - a)^{m-1},
\]

and the number of $1 \times 1$ Jordan blocks $J_{a,1}$, that is, the blocks $[a]$, is

\[
\begin{aligned}
&2 \dim \ker(f - a) - \dim \ker(f - a)^2 - \dim \ker(f - a)^0 \\
&\quad = 2 \dim \ker(f - a) - \dim \ker(f - a)^2 - \dim \ker \operatorname{Id} \\
&\quad = 2 \dim \ker(f - a) - \dim \ker(f - a)^2.
\end{aligned}
\]

Consequently the number of Jordan blocks is

\[
\begin{aligned}
&\dim \ker(f - a)^n - \dim \ker(f - a)^{n-1} \\
&\quad + 2 \dim \ker(f - a)^{n-1} - \dim \ker(f - a)^n - \dim \ker(f - a)^{n-2} \\
&\quad + \dots + 2 \dim \ker(f - a)^2 - \dim \ker(f - a)^3 - \dim \ker(f - a) \\
&\quad + 2 \dim \ker(f - a) - \dim \ker(f - a)^2 \\
&\quad = \dim \ker(f - a).
\end{aligned}
\]

□
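The cancellation in the last sum may be easier to follow in compact notation. As a sketch, write $d_j = \dim \ker(f - a)^j$ (a shorthand introduced here, not the book's notation), so that $d_0 = 0$ and $d_{n+1} = d_n = \dim V$. The total number of blocks is then a telescoping sum:

```latex
\begin{aligned}
\sum_{m=1}^{n} \bigl(2 d_m - d_{m-1} - d_{m+1}\bigr)
  &= \sum_{m=1}^{n} (d_m - d_{m-1}) - \sum_{m=1}^{n} (d_{m+1} - d_m) \\
  &= (d_n - d_0) - (d_{n+1} - d_1) \\
  &= d_n - 0 - d_n + d_1 = d_1 = \dim \ker(f - a).
\end{aligned}
```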


4.2.3. Jordan canonical form when the characteristic 
polynomial has several roots 


We begin by considering an example. 


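Example 4.2.30. Let $V$ be a 4-dimensional vector space and let $f : V \to V$ be an endomorphism. If the characteristic polynomial of $f$ is $(t - \alpha)^2 (t - \beta)^2$ for two distinct numbers $\alpha$ and $\beta$ and

\[
\dim \ker(f - \alpha) = 1, \quad \dim \ker(f - \alpha)^2 = 2, \quad \dim \ker(f - \beta) = 2,
\]

show that there is a basis $\mathcal{B}$ of $V$ such that the $\mathcal{B}$-matrix of $f$ is

\[
A = \left[\begin{array}{cc|c|c}
\alpha & 1 & 0 & 0 \\
0 & \alpha & 0 & 0 \\ \hline
0 & 0 & \beta & 0 \\ \hline
0 & 0 & 0 & \beta
\end{array}\right].
\]


Solution. Let $u$ be a vector in $\ker(f - \alpha)^2$ which is not in $\ker(f - \alpha)$ and let $\{v, w\}$ be a basis of $\ker(f - \beta)$. Then $(f - \alpha)u$ is a nonzero vector in $\ker(f - \alpha)$.

We will show that

\[
\mathcal{B} = \{(f - \alpha)u, u, v, w\}
\]

is a basis with the desired properties.
Let

\[
x_1 (f - \alpha) u + x_2 u + x_3 v + x_4 w = 0 \tag{4.7}
\]

for some $x_1, x_2, x_3, x_4 \in K$. By applying $(f - \alpha)^2$ to (4.7) we get

\[
x_1 (f - \alpha)^3 u + x_2 (f - \alpha)^2 (u) + x_3 (f - \alpha)^2 (v) + x_4 (f - \alpha)^2 (w) = 0
\]

and thus

\[
x_3 (f - \alpha)^2 (v) + x_4 (f - \alpha)^2 (w) = 0.
\]

Hence $x_3 = x_4 = 0$ because $v$ and $w$ are linearly independent eigenvectors for the eigenvalue $\beta$, so that

\[
x_3 (f - \alpha)^2 (v) + x_4 (f - \alpha)^2 (w) = (\beta - \alpha)^2 (x_3 v + x_4 w) \quad \text{and} \quad \beta \neq \alpha.
\]

Next, using the fact that $x_3 = x_4 = 0$ and applying $f - \alpha$ to (4.7), we get $x_2 = 0$ and then $x_1 = 0$. This shows that the vectors $(f - \alpha)u$, $u$, $v$, and $w$ are linearly independent and consequently $\{(f - \alpha)u, u, v, w\}$ is a basis of $V$.

Since

\[
\begin{aligned}
f((f - \alpha)(u)) &= (f - \alpha)^2 (u) + \alpha (f - \alpha)(u) = \alpha (f - \alpha)(u), \\
f(u) &= (f - \alpha) u + \alpha u, \\
f(v) &= \beta v, \\
f(w) &= \beta w,
\end{aligned}
\]

the $\mathcal{B}$-matrix of $f$ is $A$. □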


The next two results follow from Theorems 4.2.14 and 4.1.45. 
Theorem 4.2.31. Let $V$ be an $n$-dimensional vector space and let $f : V \to V$ be an endomorphism. If

\[
c_f(t) = (\lambda_1 - t)^{r_1} \cdots (\lambda_k - t)^{r_k}
\]

for some distinct $\lambda_1, \dots, \lambda_k \in K$ and some positive integers $r_1, \dots, r_k$, then there are vectors $x_1, \dots, x_m \in V$ such that

\[
V = V_{f,x_1} \oplus \dots \oplus V_{f,x_m}
\]

and for every $j \in \{1, \dots, m\}$ there is an $l \in \{1, \dots, k\}$ and an integer $q \le r_l$ such that

\[
m_{f,x_j}(t) = (t - \lambda_l)^q.
\]




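Theorem 4.2.32. Let $V$ be an $n$-dimensional vector space and let $f : V \to V$ be an endomorphism. If

\[
c_f(t) = (\lambda_1 - t)^{r_1} \cdots (\lambda_k - t)^{r_k}
\]

for some distinct $\lambda_1, \dots, \lambda_k \in K$ and some positive integers $r_1, \dots, r_k$, then there is a basis $\mathcal{B}$ of $V$ such that the $\mathcal{B}$-matrix of $f$ is of the form

\[
\begin{bmatrix}
J_1 & & & 0 \\
 & J_2 & & \\
 & & \ddots & \\
0 & & & J_r
\end{bmatrix}
\]

and for every $m \in \{1, \dots, r\}$ there is an integer $q \in \{1, \dots, k\}$ such that $J_m$ is in $\lambda_q$-Jordan canonical form.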


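Definition 4.2.33. We say that a matrix is in Jordan canonical form if it has the form

\[
\begin{bmatrix} J_1 & & \\ & \ddots & \\ & & J_r \end{bmatrix}
\]

and for every $k \in \{1, \dots, r\}$ there is a number $\alpha_k \in \mathbb{K}$ such that $J_k$ is in $\alpha_k$-Jordan canonical form.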


Example 4.2.34. Let $f : V \to V$ be an endomorphism and let $\alpha, \beta, \gamma, \delta \in \mathbb{K}$ be such that


and 


\[
\dim\ker(f-\alpha)^3 = 4, \quad \dim\ker(f-\alpha)^2 = 3, \quad \dim\ker(f-\alpha) = 2,
\]
\[
\dim\ker(f-\beta)^2 = 2, \quad \dim\ker(f-\beta) = 1,
\]
( 
( 


\[
\dim\ker(f-\delta)^2 = 2, \quad \dim\ker(f-\delta) = 1.
\]




By Theorem 4.2.31 there is a basis $\mathcal{B}$ of $V$ such that the $\mathcal{B}$-matrix of $f$ has the following Jordan canonical form


[matrix in Jordan canonical form; its entries are not recoverable from the source]
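The way kernel dimensions determine Jordan blocks can be sketched in a few lines of Python. This is only an illustration, not the authors' method: the function name `jordan_block_counts` is hypothetical, and the sketch assumes that the listed dimensions $\dim\ker(f-\alpha)^i$ run until they stop growing. It uses the standard fact that the number of blocks of size at least $i$ equals $\dim\ker(f-\alpha)^i - \dim\ker(f-\alpha)^{i-1}$.

```python
def jordan_block_counts(kernel_dims):
    """Return {block size: number of blocks} for a single eigenvalue a.

    kernel_dims[i - 1] is dim ker (f - a)^i for i = 1, 2, ...
    Assumption of this sketch: the list runs until the dimensions
    stop growing, so the last entry is the final (stable) dimension.
    """
    # d[0] = 0 and one extra copy of the stable value at the end
    d = [0] + list(kernel_dims) + [kernel_dims[-1]]
    counts = {}
    for i in range(1, len(kernel_dims) + 1):
        at_least_i = d[i] - d[i - 1]        # blocks of size >= i
        at_least_next = d[i + 1] - d[i]     # blocks of size >= i + 1
        exact = at_least_i - at_least_next  # blocks of size exactly i
        if exact:
            counts[i] = exact
    return counts


# Dimensions 2, 3, 4 (as for alpha in Example 4.2.34) and 1, 2 (as for beta)
print(jordan_block_counts([2, 3, 4]))
print(jordan_block_counts([1, 2]))
```

Under the stated assumption, the first call reports one block of size 3 and one block of size 1, and the second call reports a single block of size 2.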


4.3 The rational form 


When considering Jordan forms we assumed that the minimal polynomial of the endomorphism splits over $\mathbb{K}$. In this section we consider endomorphisms whose minimal polynomial does not necessarily split and construct bases for which the matrix of such an endomorphism has a simple form. We begin by considering an example.


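Example 4.3.1. Let $V$ be a 4-dimensional vector space over $\mathbb{R}$ and let $f : V \to V$ be an endomorphism. If $m_f(t) = (t^2+1)^2$, show that there is a basis $\mathcal{B}$ of $V$ such that the $\mathcal{B}$-matrix of $f$ is

\[
\begin{bmatrix} 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 0 \end{bmatrix}.
\]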


Solution. Let $v$ be a vector from $V$ such that $v \in \ker(f^2+1)^2$ and $v \notin \ker(f^2+1)$. Then $m_{f,v} = m_f$ and it is easy to see that

\[
\mathcal{B} = \{v, f(v), f^2(v), f^3(v)\}
\]

is a basis of $V$. Since $f(v) = f(v)$, $f(f(v)) = f^2(v)$, $f(f^2(v)) = f^3(v)$ and

\[
(f^2+1)^2(v) = f^4(v) + 2f^2(v) + v = 0,
\]

we have

\[
f(f^3(v)) = f^4(v) = -v - 2f^2(v).
\]

Consequently, the $\mathcal{B}$-matrix of $f$ is

\[
\begin{bmatrix} 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 0 \end{bmatrix}.
\]

Note that

\[
m_{f,v}(t) = 1 + 2t^2 + t^4
\]

and that the entries in the last column in the $\mathcal{B}$-matrix of $f$ are obtained from the coefficients of the polynomial $m_{f,v}(t)$. □


The following result generalizes the observation in the above example. 


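Theorem 4.3.2. Let $V$ be a finite dimensional vector space and let $v \in V$ be a nonzero vector. If $f : V \to V$ is a nonzero endomorphism such that

\[
m_{f,v}(t) = a_0 + a_1 t + \cdots + a_{k-1}t^{k-1} + t^k
\]

for some integer $k \ge 1$, then $\mathcal{B} = \{v, f(v), \dots, f^{k-1}(v)\}$ is a basis of $V_{f,v}$ and the $\mathcal{B}$-matrix of the endomorphism $f_v : V_{f,v} \to V_{f,v}$ induced by $f$ is

\[
A = \begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & -a_0 \\
1 & 0 & 0 & \cdots & 0 & -a_1 \\
0 & 1 & 0 & \cdots & 0 & -a_2 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 1 & -a_{k-1}
\end{bmatrix}.
\]

Proof. We already know that $\mathcal{B}$ is a basis of $V_{f,v}$. To show that $A$ is the $\mathcal{B}$-matrix of the endomorphism $f_v$ it suffices to observe that

\[
f(f^{k-1}(v)) = f^k(v) = -a_0 v - a_1 f(v) - \cdots - a_{k-1}f^{k-1}(v).
\]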




The next theorem is a kind of converse of the above theorem. Note that it implies that every monic polynomial of degree at least 1 is the minimal polynomial of some endomorphism on a finite dimensional vector space.


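Theorem 4.3.3. Let $V$ be a finite dimensional vector space and let $\mathcal{B} = \{v_1, \dots, v_k\}$ be a basis of $V$. If $f : V \to V$ is an endomorphism such that

\[
\begin{bmatrix}
0 & 0 & \cdots & 0 & -a_0 \\
1 & 0 & \cdots & 0 & -a_1 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & 0 & -a_{k-2} \\
0 & 0 & \cdots & 1 & -a_{k-1}
\end{bmatrix}
\]

is the $\mathcal{B}$-matrix of $f$ for some $a_0, a_1, \dots, a_{k-1} \in \mathbb{K}$, then $v_j = f^{j-1}(v_1)$ for $j \in \{2, \dots, k\}$ and

\[
m_{f,v_1}(t) = m_f(t) = a_0 + a_1 t + \cdots + a_{k-1}t^{k-1} + t^k.
\]

Proof. Since $f(v_j) = v_{j+1}$ for $j \in \{1, \dots, k-1\}$, so that $v_j = f^{j-1}(v_1)$ for $j \in \{2, \dots, k\}$, and

\[
f(v_k) = f(f^{k-1}(v_1)) = f^k(v_1) = -a_0 v_1 - a_1 f(v_1) - \cdots - a_{k-1}f^{k-1}(v_1),
\]

we have

\[
a_0 v_1 + a_1 f(v_1) + \cdots + a_{k-1}f^{k-1}(v_1) + f^k(v_1) = 0.
\]

Linear independence of the vectors $v_1, \dots, v_k$ implies that

\[
m_{f,v_1}(t) = a_0 + a_1 t + \cdots + a_{k-1}t^{k-1} + t^k.
\]

Moreover,

\[
\begin{aligned}
m_{f,v_1}(f)(v_2) &= m_{f,v_1}(f)(f(v_1)) = f(m_{f,v_1}(f)(v_1)) = 0, \\
&\ \ \vdots \\
m_{f,v_1}(f)(v_k) &= m_{f,v_1}(f)(f^{k-1}(v_1)) = f^{k-1}(m_{f,v_1}(f)(v_1)) = 0,
\end{aligned}
\]

which shows that $m_{f,v_1}$ is an annihilator of $f$ and $m_f = m_{f,v_1}$. □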




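Definition 4.3.4. Let $a_0, \dots, a_{k-1} \in \mathbb{K}$. The matrix

\[
\begin{bmatrix}
0 & 0 & 0 & \cdots & 0 & 0 & -a_0 \\
1 & 0 & 0 & \cdots & 0 & 0 & -a_1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & 0 & 0 & -a_{k-3} \\
0 & 0 & 0 & \cdots & 1 & 0 & -a_{k-2} \\
0 & 0 & 0 & \cdots & 0 & 1 & -a_{k-1}
\end{bmatrix}
\]

is called the companion matrix of the polynomial

\[
p(t) = a_0 + a_1 t + \cdots + a_{k-1}t^{k-1} + t^k
\]

and is denoted by $c_p$.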
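A small Python sketch can make the definition concrete. It builds the companion matrix from the list of coefficients and checks numerically that the polynomial vanishes at its own companion matrix. The helper names `companion`, `mat_mul`, and `evaluate_at_matrix` are hypothetical and chosen only for this illustration; the sample polynomial is $(t^2+1)^2$ from Example 4.3.1.

```python
def companion(coeffs):
    """Companion matrix of p(t) = a_0 + a_1 t + ... + a_{k-1} t^{k-1} + t^k.

    coeffs = [a_0, ..., a_{k-1}]; the leading coefficient 1 is implicit.
    """
    k = len(coeffs)
    rows = [[0] * k for _ in range(k)]
    for i in range(1, k):
        rows[i][i - 1] = 1           # ones on the subdiagonal
    for i in range(k):
        rows[i][k - 1] = -coeffs[i]  # last column: -a_0, ..., -a_{k-1}
    return rows


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def evaluate_at_matrix(coeffs, c):
    """Compute a_0 I + a_1 C + ... + a_{k-1} C^{k-1} + C^k."""
    n = len(c)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # C^0 = I
    total = [[0] * n for _ in range(n)]
    for a in list(coeffs) + [1]:
        total = [[total[i][j] + a * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = mat_mul(power, c)
    return total


# p(t) = (t^2 + 1)^2 = 1 + 0 t + 2 t^2 + 0 t^3 + t^4
c = companion([1, 0, 2, 0])
print(c)
print(evaluate_at_matrix([1, 0, 2, 0], c))  # every entry is 0
```

The printed companion matrix agrees with the matrix in Example 4.3.1, and the second print shows that $p(c_p) = 0$ for this polynomial.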


In the next example we consider a matrix where instead of Jordan blocks we 
have companion matrices. 


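Example 4.3.5. Let $V$ be a vector space such that $\dim V = 5$. If $f : V \to V$ is an endomorphism such that $c_f(t) = (\alpha - t)^5$ and $m_f(t) = (t - \alpha)^3$, for some $\alpha \in \mathbb{K}$, and

\[
\dim\ker(f-\alpha) = 2 \quad \text{and} \quad \dim\ker(f-\alpha)^2 = 4,
\]

show that there is a basis $\mathcal{B}$ of $V$ such that the $\mathcal{B}$-matrix of $f$ is

\[
\begin{bmatrix}
0 & 0 & \alpha^3 & 0 & 0 \\
1 & 0 & -3\alpha^2 & 0 & 0 \\
0 & 1 & 3\alpha & 0 & 0 \\
0 & 0 & 0 & 0 & -\alpha^2 \\
0 & 0 & 0 & 1 & 2\alpha
\end{bmatrix}.
\]

Solution. Let $u$ and $v$ be the vectors from Example 4.2.12. We will show that

\[
\mathcal{B} = \{u, f(u), f^2(u), v, f(v)\}
\]

is a basis of $V$ with the desired property.
First we note that

\[
\operatorname{Span}\{(f-\alpha)^2 u, (f-\alpha)u, u\} = \operatorname{Span}\{u, f(u), f^2(u)\}
\]

and

\[
\operatorname{Span}\{(f-\alpha)v, v\} = \operatorname{Span}\{v, f(v)\}.
\]

Now, since $\{(f-\alpha)^2 u, (f-\alpha)u, u, (f-\alpha)v, v\}$ is a basis, the set $\{u, f(u), f^2(u), v, f(v)\}$ is also a basis.
From

\[
(f-\alpha)^3(u) = f^3(u) - 3\alpha f^2(u) + 3\alpha^2 f(u) - \alpha^3 u = 0
\]

we get

\[
f(f^2(u)) = f^3(u) = \alpha^3 u - 3\alpha^2 f(u) + 3\alpha f^2(u).
\]

Since we also have

\[
f(f(v)) = f^2(v) = -\alpha^2 v + 2\alpha f(v),
\]

the $\mathcal{B}$-matrix of $f$ is

\[
\begin{bmatrix}
0 & 0 & \alpha^3 & 0 & 0 \\
1 & 0 & -3\alpha^2 & 0 & 0 \\
0 & 1 & 3\alpha & 0 & 0 \\
0 & 0 & 0 & 0 & -\alpha^2 \\
0 & 0 & 0 & 1 & 2\alpha
\end{bmatrix}.
\]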


Theorem 4.3.6. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be a nonzero endomorphism such that $m_f = pq$, where $p$ and $q$ are two monic polynomials such that $\mathrm{GCD}(p, q) = 1$. If $g$ is the endomorphism induced by $f$ on $\ker p(f)$ and $h$ is the endomorphism induced by $f$ on $\ker q(f)$, then $m_g = p$ and $m_h = q$.

Proof. Let $a$ and $b$ be polynomials such that $ap + bq = 1$. Since $m_g(g) = 0$, we have $m_g(g)b(g)q(g) = 0$ and thus $m_g(f)b(f)q(f)(x) = 0$ for every $x \in \ker p(f)$. Moreover, $m_g(f)b(f)q(f)(y) = 0$ for every $y \in \ker q(f)$. Consequently, $m_g(f)b(f)q(f) = 0$, because every vector from $V$ is of the form $x + y$ where $x \in \ker p(f)$ and $y \in \ker q(f)$, by Theorem 4.1.44. Clearly, $m_f = pq$ divides $m_g bq = m_g(1 - ap)$. Consequently, $p$ divides $m_g(1 - ap)$, which implies that $p$ divides $m_g$, because $\mathrm{GCD}(p, 1 - ap) = 1$.

For every $x \in \ker p(f)$ we have $p(g)(x) = p(f)(x) = 0$, which implies that $m_g$ divides $p$. Since the monic polynomials $p$ and $m_g$ divide each other, we have $p = m_g$. In a similar way we can show that $q = m_h$. □


In Theorem 4.2.7 we show that, if $V$ is a finite dimensional vector space and $f : V \to V$ is an endomorphism such that $m_f(t) = (t - \alpha)^k$ for some $\alpha \in \mathbb{K}$, then there is a vector $v \in V$ such that $m_f = m_{f,v}$. Now we can show that the assumption that $m_f(t) = (t - \alpha)^k$ is unnecessary.




Theorem 4.3.7. Let $V$ be a finite dimensional vector space. For every nonzero endomorphism $f : V \to V$ there is a nonzero vector $v \in V$ such that

\[
m_f = m_{f,v}.
\]

Proof. Let $m_f = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ where $p_1, \dots, p_k$ are distinct irreducible monic polynomials and $\alpha_1, \dots, \alpha_k$ are positive integers. By Theorem 4.3.6, for every $j \in \{1, \dots, k\}$ the polynomial $p_j^{\alpha_j}$ is the minimal polynomial of the endomorphism $f_j$ induced by $f$ on $\ker p_j^{\alpha_j}(f)$. By Theorem 4.1.44, we have

\[
V = \ker p_1^{\alpha_1}(f) \oplus \cdots \oplus \ker p_k^{\alpha_k}(f). \tag{4.8}
\]

For every $j \in \{1, \dots, k\}$ we choose a vector $v_j \in \ker p_j^{\alpha_j}(f)$ such that $v_j \notin \ker p_j^{\alpha_j - 1}(f)$. Because the polynomial $p_j$ is irreducible this means that $m_{f,v_j} = p_j^{\alpha_j}$. Now, since

\[
\begin{aligned}
0 &= m_{f,v_1+\cdots+v_k}(f)(v_1 + \cdots + v_k) \\
  &= m_{f,v_1+\cdots+v_k}(f)(v_1) + \cdots + m_{f,v_1+\cdots+v_k}(f)(v_k)
\end{aligned}
\]

and because $m_{f,v_1+\cdots+v_k}(f)(v_j) \in \ker p_j^{\alpha_j}(f)$, it follows from (4.8) that $m_{f,v_1+\cdots+v_k}(f)(v_j) = 0$ for every $j \in \{1, \dots, k\}$. Hence, using the equality $m_{f,v_j} = p_j^{\alpha_j}$, it follows that the polynomial $p_j^{\alpha_j}$ divides the polynomial $m_{f,v_1+\cdots+v_k}(t)$. Consequently

\[
m_{f,v_1+\cdots+v_k} = p_1^{\alpha_1} \cdots p_k^{\alpha_k} = m_f
\]

because, by definition, $m_{f,v_1+\cdots+v_k}$ divides the product $p_1^{\alpha_1} \cdots p_k^{\alpha_k}$. □


Example 4.3.8. Let $V$ be a vector space such that $\dim V = 4$ and let $f : V \to V$ be an endomorphism. We assume that $V = V_1 \oplus V_2$ where $V_1$ and $V_2$ are two invariant subspaces of $f$. Let $f_1 : V_1 \to V_1$ and $f_2 : V_2 \to V_2$ be the induced endomorphisms. If $m_{f_1} = t^2$ and $m_{f_2} = t^2 - t + 1$, show that there is a basis $\mathcal{B}$ of $V$ such that the $\mathcal{B}$-matrix of $f$ is

\[
\begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \end{bmatrix}.
\]


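Solution. Let $v \in V$ be such that

\[
m_{f,v} = m_f = m_{f_1} m_{f_2} = t^2(t^2 - t + 1).
\]

Then

\[
\mathcal{B} = \{v, f(v), f^2(v), f^3(v)\}
\]

is a basis of $V$. Obviously,

\[
\begin{aligned}
f(v) &= f(v), \\
f(f(v)) &= f^2(v), \\
f(f^2(v)) &= f^3(v).
\end{aligned}
\]

Moreover, since

\[
f^4(v) - f^3(v) + f^2(v) = 0,
\]

we have

\[
f(f^3(v)) = f^4(v) = f^3(v) - f^2(v).
\]

Consequently, the $\mathcal{B}$-matrix of $f$ is

\[
\begin{bmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \end{bmatrix}.
\]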


Theorem 4.3.9. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be a nonzero endomorphism. There are $v_1, \dots, v_n \in V$ such that

\[
V = V_{f,v_1} \oplus \cdots \oplus V_{f,v_n}
\]

and $m_{f,v_j}$ is a multiple of $m_{f,v_{j+1}}$ for every $j \in \{1, \dots, n-1\}$.

Proof. By Theorem 4.3.7, there is a vector $v_1 \in V$ such that $m_f(t) = m_{f,v_1}(t)$ and, by Lemma 4.2.13, we have

\[
V = V_{f,v_1} \oplus V_2,
\]

where $V_2$ is an $f$-invariant subspace of $V$. Now, there is a vector $v_2 \in V_2$ such that $m_{f_2}(t) = m_{f,v_2}(t)$, where $f_2$ is the endomorphism induced by $f$ on $V_2$. Clearly, $m_{f_2}$ divides $m_f$. Applying Lemma 4.2.13 again we get

\[
V = V_{f,v_1} \oplus V_{f,v_2} \oplus V_3,
\]

where $V_3$ is an $f$-invariant subspace of $V$. Continuing this way we can produce the desired decomposition of $V$. □


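Example 4.3.10. Let $V$, $f$, $u$, $v$, and $w$ be as defined in Example 4.2.30. Show that

\[
V = V_{f,u+v} \oplus V_{f,w}.
\]

Solution. Since $m_{f,u}(t) = (t-\alpha)^2$ and $m_{f,v}(t) = t - \beta$, we have

\[
m_{f,u+v} = (t-\alpha)^2(t-\beta) = m_f.
\]

Because $V_{f,u+v} \subseteq V_{f,u} \oplus V_{f,v}$ and because $\dim V_{f,u+v} = 3$, we have $V_{f,u+v} = V_{f,u} \oplus V_{f,v}$. This gives us $V = V_{f,u} \oplus V_{f,v} \oplus V_{f,w} = V_{f,u+v} \oplus V_{f,w}$. □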


In Theorem 4.3.9 we show that for any endomorphism $f$ on a finite dimensional space $V$ there are $v_1, \dots, v_n \in V$ such that $V = V_{f,v_1} \oplus \cdots \oplus V_{f,v_n}$ and $m_{f,v_j}$ is a multiple of $m_{f,v_{j+1}}$ for every $j \in \{1, \dots, n-1\}$. In Theorem 4.3.12 below we will show that, while the vectors $v_1, \dots, v_n$ are not unique, the integer $n$ and the polynomials $m_{f,v_1}, \dots, m_{f,v_n}$ are unique. In the proof of Theorem 4.3.12 we use the following lemma.


Lemma 4.3.11. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be a nonzero endomorphism. If, for a vector $v \in V$, the polynomial $p$ is a divisor of $m_{f,v}$, then

\[
\dim p(f)(V_{f,v}) = \dim V_{f,v} - \deg p.
\]

Proof. We can assume that the polynomial $p$ is monic and that $p \neq m_{f,v}$. Then

\[
p(f)(V_{f,v}) = V_{f,p(f)v}.
\]

Now, if $q$ is a monic polynomial such that $m_{f,v} = pq$, then clearly $q = m_{f,p(f)v}$. This gives us

\[
\dim p(f)(V_{f,v}) = \deg q = \deg m_{f,v} - \deg p = \dim V_{f,v} - \deg p.
\]




Theorem 4.3.12. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be a nonzero endomorphism. If

\[
V = V_{f,v_1} \oplus \cdots \oplus V_{f,v_n}
\]

for some nonzero vectors $v_1, \dots, v_n \in V$ such that $m_{f,v_1} = m_f$ and $m_{f,v_j}$ is a multiple of $m_{f,v_{j+1}}$ for every $j \in \{1, \dots, n-1\}$ and

\[
V = V_{f,w_1} \oplus \cdots \oplus V_{f,w_q}
\]

for some nonzero vectors $w_1, \dots, w_q \in V$ such that $m_{f,w_1} = m_f$ and $m_{f,w_j}$ is a multiple of $m_{f,w_{j+1}}$ for every $j \in \{1, \dots, q-1\}$, then $q = n$ and

\[
m_{f,v_j} = m_{f,w_j}
\]

for every $j \in \{1, \dots, n\}$.

Proof. Let $k$ be a positive integer such that $k \le n$ and $k \le q$. Suppose that $m_{f,v_j} = m_{f,w_j}$ for every $j \in \{1, \dots, k-1\}$. Since

\[
m_{f,v_k}(f)(V_{f,v_j}) = 0
\]

for every $j \in \{k, \dots, n\}$, we have

\[
m_{f,v_k}(f)(V) = m_{f,v_k}(f)(V_{f,v_1}) \oplus \cdots \oplus m_{f,v_k}(f)(V_{f,v_{k-1}}),
\]

which gives us

\[
\dim m_{f,v_k}(f)(V) = \dim m_{f,v_k}(f)(V_{f,v_1}) + \cdots + \dim m_{f,v_k}(f)(V_{f,v_{k-1}}).
\]

We also have

\[
m_{f,v_k}(f)(V) = m_{f,v_k}(f)(V_{f,w_1}) \oplus \cdots \oplus m_{f,v_k}(f)(V_{f,w_q})
\]

and thus

\[
\dim m_{f,v_k}(f)(V) = \dim m_{f,v_k}(f)(V_{f,w_1}) + \cdots + \dim m_{f,v_k}(f)(V_{f,w_q}).
\]

By the assumptions and Lemma 4.3.11 we have

\[
\dim m_{f,v_k}(f)(V_{f,v_j}) = \dim m_{f,v_k}(f)(V_{f,w_j})
\]

for every $j \in \{1, \dots, k-1\}$, because $m_{f,v_k}$ is a divisor of all polynomials $m_{f,w_1} = m_{f,v_1}, \dots, m_{f,w_{k-1}} = m_{f,v_{k-1}}$. Hence $\dim m_{f,v_k}(f)(V_{f,w_k}) = 0$, which implies that

\[
m_{f,v_k}(f)(V_{f,w_k}) = 0.
\]

This shows that $m_{f,w_k}$ divides $m_{f,v_k}$.

In the same way we can show that $m_{f,v_k}$ divides $m_{f,w_k}$. Consequently, $m_{f,v_k} = m_{f,w_k}$, because the polynomials $m_{f,v_k}$ and $m_{f,w_k}$ are monic. We finish the proof by induction and using the equality

\[
\dim V = \dim V_{f,v_1} + \cdots + \dim V_{f,v_n} = \dim V_{f,w_1} + \cdots + \dim V_{f,w_q}.
\]
□


The polynomials $m_{f,v_1}, \dots, m_{f,v_n}$ in Theorem 4.3.12 are called the invariant factors of $f$.


Example 4.3.13. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism such that the invariant factors of $f$ are

\[
(t-1)^3(t-2)^3(t-3), \quad (t-1)^2(t-2)^3, \quad \text{and} \quad (t-1)(t-2).
\]

Determine the Jordan blocks from the Jordan canonical form.

Solution.
\[
J_{1,3},\ J_{1,2},\ J_{1,1},\ J_{2,3},\ J_{2,3},\ J_{2,1},\ J_{3,1}.
\]
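The passage from invariant factors to Jordan blocks can be sketched in Python for the case where every invariant factor splits into linear factors. The encoding below (one dictionary per invariant factor, mapping an eigenvalue to its exponent) and the name `jordan_blocks` are assumptions made for this illustration. The sample data reads the invariant factors of Example 4.3.13 as $(t-1)^3(t-2)^3(t-3)$, $(t-1)^2(t-2)^3$, and $(t-1)(t-2)$, which is what the listed blocks suggest.

```python
def jordan_blocks(invariant_factors):
    """List the Jordan blocks as pairs (eigenvalue, size).

    Each invariant factor is a dict {eigenvalue: exponent} standing for
    the product of the powers (t - eigenvalue)^exponent. Every such
    power contributes one Jordan block J_{eigenvalue, exponent}.
    """
    blocks = []
    for factor in invariant_factors:
        for eigenvalue, exponent in factor.items():
            blocks.append((eigenvalue, exponent))
    # sort by eigenvalue, larger blocks first
    return sorted(blocks, key=lambda block: (block[0], -block[1]))


factors = [{1: 3, 2: 3, 3: 1}, {1: 2, 2: 3}, {1: 1, 2: 1}]
print(jordan_blocks(factors))
```

With this reading, the output lists the pairs $(1,3), (1,2), (1,1), (2,3), (2,3), (2,1), (3,1)$, in agreement with the solution above.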


Definition 4.3.14. With the notation from Theorem 4.3.12 the matrix

\[
\begin{bmatrix} c_{m_{f,v_1}} & & \\ & \ddots & \\ & & c_{m_{f,v_n}} \end{bmatrix}
\]

is called the rational form of $f$.


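Example 4.3.15. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If the Jordan canonical form of $f$ is

\[
\begin{bmatrix} \alpha & 1 & 0 & 0 \\ 0 & \alpha & 0 & 0 \\ 0 & 0 & \beta & 0 \\ 0 & 0 & 0 & \beta \end{bmatrix},
\]

determine the invariant factors of $f$ and the rational form of $f$.

Solution. The invariant factors are

\[
(t-\alpha)^2(t-\beta) = t^3 - (2\alpha+\beta)t^2 + (\alpha^2 + 2\alpha\beta)t - \alpha^2\beta \quad \text{and} \quad t - \beta.
\]

Consequently, the rational form is

\[
\begin{bmatrix}
0 & 0 & \alpha^2\beta & 0 \\
1 & 0 & -(\alpha^2 + 2\alpha\beta) & 0 \\
0 & 1 & 2\alpha + \beta & 0 \\
0 & 0 & 0 & \beta
\end{bmatrix}.
\]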


Example 4.3.16. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If the Jordan canonical form of $f$ is

[matrix in Jordan canonical form; its entries are not recoverable from the source]


determine the invariant factors of f. 


Solution. The invariant factors are 


(t—a)¢—- fy t-y)?, (-a)t-y), and t—7. 


4.4 Exercises 


4.4.1. Diagonalization 


Exercise 4.1. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $U$ and $W$ are $f$-invariant subspaces and $V = U \oplus W$, show that $\det f = \det f_U \det f_W$.


Exercise 4.2. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $p$, $q$, and $r$ are polynomials such that $\mathrm{GCD}(p, q) = r$, show that $\ker p(f) \cap \ker q(f) = \ker r(f)$.




Exercise 4.3. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $p$ and $q$ are polynomials such that $\mathrm{GCD}(p, q) = 1$, show that $\ker(pq)(f) = \ker p(f) \oplus \ker q(f)$.


Exercise 4.4. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. Suppose that $p_1, \dots, p_n$ are polynomials such that $\mathrm{GCD}(p_j, p_k) = 1$ for $j, k \in \{1, \dots, n\}$ and $j \neq k$. Use Exercise 4.3 and mathematical induction to prove that

\[
\ker(p_1 \cdots p_n)(f) = \ker p_1(f) \oplus \cdots \oplus \ker p_n(f).
\]


Exercise 4.5. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $\lambda_1, \dots, \lambda_k$ are distinct numbers, show that

\[
\dim\ker(f-\lambda_1) + \cdots + \dim\ker(f-\lambda_k) = \dim(\ker(f-\lambda_1) + \cdots + \ker(f-\lambda_k)).
\]


Exercise 4.6. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. Suppose that $p_1, \dots, p_n$ are polynomials such that $\mathrm{GCD}(p_j, p_k) = 1$ for $j, k \in \{1, \dots, n\}$ and $j \neq k$. Show that for every $j \in \{1, \dots, n\}$ there is a polynomial $q_j$ such that $q_j(f)$ is the projection of $\ker p_j(f)$ on $\ker(p_1 \cdots p_n)(f)$ along

\[
\ker p_1(f) \oplus \cdots \oplus \ker p_{j-1}(f) \oplus \ker p_{j+1}(f) \oplus \cdots \oplus \ker p_n(f).
\]


Exercise 4.7. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $p(f) = 0$ for some polynomial $p$, show that every eigenvalue of $f$ is a root of $p$.


Exercise 4.8. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism such that $p(f) = 0$ where $p$ is a polynomial which can be written as a product of distinct monic polynomials of degree 1. Show that $f$ is diagonalizable.


Exercise 4.9. Let V be a finite dimensional vector space and let f : V > V be 
an endomorphism. Show that, if f? is diagonalizable and has distinct positive 
eigenvalues, then f is diagonalizable. 


Exercise 4.10. Let V be a finite dimensional vector space and let f :V— V 
be an endomorphism. If c¢(t) = (5 — t)?(3 — t)’, show that f is diagonalizable 
if and only if (f — 5)(f — 3) =0. 


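Exercise 4.11. Let $f : \mathbb{C}^3 \to \mathbb{C}^3$ be the endomorphism defined by

\[
f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}.
\]

Determine $c_f$ and $m_f$ and show that $f$ is diagonalizable.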




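Exercise 4.12. Let $f : \mathbb{C}^2 \to \mathbb{C}^2$ be the endomorphism defined by

\[
f\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 7i & -2 \\ -25 & -7i \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.
\]

Show that $f$ is diagonalizable and determine the eigenspaces of $f$.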


Exercise 4.13. Let $V$ be a finite dimensional vector space, $f : V \to V$ an endomorphism, and $g : V \to V$ an isomorphism. Show that $c_f = c_{gfg^{-1}}$.


Exercise 4.14. Let $V$ be a finite dimensional vector space, $f : V \to V$ an endomorphism, and $g : V \to V$ an isomorphism. Show that $m_f = m_{gfg^{-1}}$.


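Exercise 4.15. Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be the endomorphism defined by

\[
f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} -3 & 43 & -17 \\ -4 & 29 & -10 \\ -8 & 60 & -21 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}.
\]

Find $c_f$.

Exercise 4.16. Show that the endomorphism $f : \mathbb{R}^3 \to \mathbb{R}^3$ defined by

\[
f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} -3 & 43 & -17 \\ -4 & 29 & -10 \\ -8 & 60 & -21 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\]

is not diagonalizable and find the eigenspaces of this endomorphism.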


Exercise 4.17. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. Show that 0 is a root of the minimal polynomial $m_f$ if and only if $f$ is not invertible.


Exercise 4.18. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $U$ is an $f$-invariant subspace of $V$ such that $\dim U = 1$, show that every nonzero vector in $U$ is an eigenvector.


Exercise 4.19. Let $V$ be a finite dimensional vector space, $f, g : V \to V$ endomorphisms, and $\alpha \in \mathbb{K}$. If $\alpha$ is an eigenvalue of $f$ and $g^3 - 3g^2 f + 3g f^2 - f^3 = 0$, show that $\alpha$ is an eigenvalue of $g$.


Exercise 4.20. Let $V$ be a finite dimensional vector space and let $f, g : V \to V$ be endomorphisms such that $fg = gf$. If $m_f = t^j$ and $m_g = t^k$ for some positive integers $j$ and $k$, show that $m_{fg} = t^r$ and $m_{f+g} = t^s$ for some positive integers $r$ and $s$.


Exercise 4.21. Let V be a finite dimensional vector space and let f :V— V 
be an endomorphism. If the characteristic polynomial of f is cr (t) = 5t? + 7¢? + 
2t +8, show that f is invertible and find f~?. 


Exercise 4.22. Let $V$ and $W$ be vector spaces. Let $f : V \to V$ be an endomorphism and let $g : W \to V$ be an invertible linear transformation. Show that $\lambda \in \mathbb{K}$ is an eigenvalue of $f$ if and only if $\lambda$ is an eigenvalue of $g^{-1}fg$.




Exercise 4.23. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $U$ is an $f$-invariant subspace of $V$, show that $m_{f_U}$ divides $m_f$.


Exercise 4.24. Let $V$ be a finite dimensional vector space, $f : V \to V$ an endomorphism, and $U$ an $f$-invariant subspace of $V$. Let $v, w \in V$ be such that $v + w \in U$. If $v$ and $w$ are eigenvectors of $f$ corresponding to distinct eigenvalues, show that both vectors $v$ and $w$ are in $U$.


Exercise 4.25. Let $V$ be a finite dimensional complex vector space and let $f : V \to V$ be an endomorphism such that $f^4 + \mathrm{Id} = 0$. Show that $f$ is diagonalizable.


Exercise 4.26. Let $V$ be a finite dimensional vector space, $f : V \to V$ an endomorphism, and $\lambda$ a nonzero number. If $f$ is invertible, show that $\lambda$ is an eigenvalue of $f$ if and only if $\frac{1}{\lambda}$ is an eigenvalue of $f^{-1}$.


Exercise 4.27. Let V be a finite dimensional vector space, f : ¥V — V an 
endomorphism, and v € VY an eigenvector of f corresponding to an eigenvalue 
\. We assume that a, € K are such that y 4 a and y 4 @. If u € ker((f — 
a)(f — B)) and u+ v = 0, show that v = 0. 


Exercise 4.28. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. Show that 0 is an eigenvalue of $f$ if and only if $f$ is not invertible.


Exercise 4.29. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $U$ is an $f$-invariant subspace of $V$ and $f$ is diagonalizable, show that $f_U$ is diagonalizable.


Exercise 4.30. Let $V$ be a finite dimensional vector space, $f : V \to V$ an endomorphism, and $\alpha \in \mathbb{K}$. If $\dim V = n$ and there is an integer $k \ge 1$ such that $(f - \alpha)^k = 0$, determine the characteristic polynomial of $f$.


Exercise 4.31. Let $V$ be a finite dimensional complex vector space and let $f : V \to V$ be an endomorphism such that $f^2 - 2f + 5 = 0$. What are all possible polynomials $m_f$?


Exercise 4.32. Let V be a finite dimensional complex vector space and let 
f : ¥ > ¥V be an endomorphism such that f? — id = 0. Show that f is 
diagonalizable. 


Exercise 4.33. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $U$ is an $f$-invariant subspace of $V$, $v, w \in V$ are such that $v, v + w \in U$, and $w$ is an eigenvector of $f$ corresponding to a nonzero eigenvalue $\alpha \in \mathbb{K}$, show that $w \in U$.


Exercise 4.34. Let $V$ be a finite dimensional complex vector space and let $f : V \to V$ be an endomorphism. If $c_f = pq$, where $p$ and $q$ are polynomials such that $\mathrm{GCD}(p, q) = 1$, and $g$ and $h$ are the linear transformations induced by $f$ on $\ker p(f)$ and $\ker q(f)$, respectively, show that there are nonzero $\alpha, \beta \in \mathbb{C}$ such that $p = \alpha c_g$ and $q = \beta c_h$.




4.4.2 Jordan canonical form 


Exercise 4.35. Let $V$ be a finite dimensional complex vector space and let $f : V \to V$ be an endomorphism. If $U$ is an $f$-invariant subspace of $V$ and $v \in U$, show that the cyclic subspace of $f$ associated with $v$ is a subspace of $U$.
Exercise 4.36. Show that the endomorphism f : R? > R? corresponding to 
the matrix | - : is not diagonalizable and find the Jordan canonical form 
of f and a Jordan basis. 


Exercise 4.37. Use Exercise 4.36 to solve the system 


vu’ = 3x4 4y 
| ae en a 


Exercise 4.38. Let f : R® — R® be the endomorphism defined by 


x 1 1 0 x 
f y =}-5 4 1 y 
z I 422° z 


Determine the Jordan canonical form of f and a Jordan basis. 


Exercise 4.39. Find an endomorphism f : C? + C? with the Jordan canonical 


form : ’ and a Jordan basis a Ate 3 
0 2 —3 a 


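Exercise 4.40. Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be the endomorphism defined by

\[
f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} -3 & 43 & -17 \\ -4 & 29 & -10 \\ -8 & 60 & -21 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}.
\]

Using Exercises 4.15 and 4.16 find the Jordan canonical form of $f$ and a Jordan basis.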


Exercise 4.41. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $m_f(t) = t^n$ for some integer $n \ge 1$, show that $\{0\} \subsetneq \ker f \subsetneq \cdots \subsetneq \ker f^{n-1} \subsetneq \ker f^n = V$.


Exercise 4.42. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $m \ge 2$ and $S_m$ is a subspace of $V$ such that $\ker f^{m-1} \cap S_m = \{0\}$, show that $f(S_m) \cap \ker f^{m-2} = \{0\}$.


Exercise 4.43. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $m \ge 2$ and $S_m$ is a subspace of $V$ such that $\ker f^{m-1} \cap S_m = \{0\}$, show that the function $g : S_m \to f(S_m)$ defined by $g(v) = f(v)$ is an isomorphism.




Exercise 4.44. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism such that $m_f(t) = t^n$ for some integer $n \ge 1$. Using Exercise 4.42, show that for every $m \in \{1, \dots, n\}$ there is a subspace $S_m$ of $V$ such that $\ker f^m = \ker f^{m-1} \oplus S_m$ for $m \in \{1, \dots, n\}$ and $f(S_m) \subseteq S_{m-1}$ for $m \in \{2, \dots, n\}$.


Exercise 4.45. Give an example of a vector space V and an endomorphism 
f:V—V such that cp = (t — 2)4(t — 5)* and my = (t — 2)?(t — 5). 


Exercise 4.46. Let $V$ and $W$ be finite dimensional vector spaces, $f : V \to V$ and $g : W \to W$ endomorphisms, $v \in V$ and $w \in W$. If $m_{f,v} = m_{g,w}$, show that there is an isomorphism $h : V_{f,v} \to W_{g,w}$ such that $g(x) = h(f(h^{-1}(x)))$ for every $x \in W_{g,w}$.


Exercise 4.47. Let $V$ and $W$ be finite dimensional vector spaces and let $f : V \to V$ and $g : W \to W$ be endomorphisms. If $f$ and $g$ have the same Jordan canonical form, show that there is an isomorphism $h : V \to W$ such that $g = hfh^{-1}$.


Exercise 4.48. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. If $c_f(t) = (\alpha - t)^n$ for some $\alpha \in \mathbb{K}$ and integer $n \ge 1$, show that there is an endomorphism $g : V \to V$ such that $c_g(t) = (-t)^n$ and we have $f = g + \alpha\,\mathrm{Id}$.


Exercise 4.49. Let $V$ be a vector space such that $\dim V = 8$ and let $f : V \to V$ be an endomorphism such that $m_f(t) = (t - \alpha)^5$ for some $\alpha \in \mathbb{K}$. If

\[
\dim\ker(f-\alpha)^4 = 7, \quad \dim\ker(f-\alpha)^3 = 6, \quad \dim\ker(f-\alpha)^2 = 4, \quad \dim\ker(f-\alpha) = 2,
\]

find the Jordan canonical form of $f$ and a Jordan basis.


Exercise 4.50. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be an endomorphism. We assume that $m_f = t^4$ and that the Jordan canonical form of $f$ has 7 Jordan blocks $J_{0,4}$, 4 Jordan blocks $J_{0,3}$, one block $J_{0,2}$, and one Jordan block $J_{0,1}$. Determine $\dim\ker f^3$, $\dim\ker f^2$, and $\dim\ker f$.


Exercise 4.51. With the notation from Exercise 4.44 show that dim S,, is the 
number of Jordan blocks Jo., of the Jordan canonical form of f and that for 
every m € {1,...,n —1} the dimension dim7,,, equals the number of Jordan 
blocks Jo.m of the Jordan canonical form of f. 


Exercise 4.52. Let V be a finite dimensional vector space and let f :V — V 
be an endomorphism. If cr = (t — a)4(t — 8)? and my = (t — a)3(t — B)? for 
some a, 2 € K, find all possible Jordan canonical forms of f. 


Exercise 4.53. Let $V$ be a finite dimensional vector space and let $f, g : V \to V$ be endomorphisms. If $g$ is invertible, show that $f$ and $gfg^{-1}$ have the same Jordan canonical form.


Exercise 4.54. Let V be a vector space such that dim V = 4 and let f: VV 
be an endomorphism such that m;(t) = (¢— a)® for some a € K. Determine 
the Jordan canonical form of f and explain the construction of a Jordan basis. 




Exercise 4.55. Let V be a vector space such that dim V = 4 and let f: V- V 
be an endomorphism such that cf(t) = (a — t)* and mf(t) = (t — a)? for some 
a € K. If dimker(f — a) = 2, determine the Jordan canonical form of f and 
explain the construction of a Jordan basis. 


Exercise 4.56. Let V be a finite dimensional vector space and let f :V— V 
be an endomorphism such that the Jordan canonical form of f is 


oocococ°o 


0 
0 
0 
0 
1 


oooococReng re 
ooaooe FOO 
OS: O'1:O-Ore: O..oye 
RQ rFeoaqncadacado 


a l 
0 a 
0 O 
0 O 
0 O 
0 O 
0 O 
0 O 


for some a € K. Determine mr. 


Exercise 4.57. Let V be a finite dimensional vector space and let f :V— V 
be an endomorphism such that the Jordan canonical form of f is 


oo 
QS RPloo 
(en i cm) en aera) 


o@M ee 


fe: 


OaBaraocoaoonec 


ooooe 
ooo 


ooocooqooaoe 
oocooqooe 


oO 

oO 

Oo 

oO 
RBoOoOoadgcoaoo 


for some a, 3 € K. Determine my. 


Exercise 4.58. Let V be a vector space such that dim V = 7 and let f: V- V 
be an endomorphism such that ms = (t — a)* for some a € K. Determine all 
possible Jordan canonical forms of f. 


Exercise 4.59. Let V be a vector space such that dimV = 25 and let f : 
Vv + V be an endomorphism such that m f(t) = (t — a)? for some a € K. If 
dim ker(f — a)? = 21 and dimker(f — a) = 12, determine the number of Jordan 
blocks of the form A : in the Jordan canonical form of f. 

Exercise 4.60. Let V be a finite dimensional vector space and let f :V— V 
be an endomorphism. If cr = (t — a)4(t — 8)* and my = (t — a)?(t — B)? for 
some a, 2 € K, determine all possible Jordan canonical forms of f. 


Exercise 4.61. Let $V$ be a finite dimensional vector space and let $f : V \to V$ be a diagonalizable endomorphism. Assume $v_1, \dots, v_n$ is a basis of $V$ consisting of eigenvectors of $f$ corresponding to eigenvalues $\lambda_1, \dots, \lambda_n$. If $\lambda_1, \dots, \lambda_k$ are the nonzero eigenvalues of $f$, show that $\{v_1, \dots, v_k\}$ is a basis of $\operatorname{ran} f$ and $\{v_{k+1}, \dots, v_n\}$ is a basis of $\ker f$.


Exercise 4.62. Let V be a finite dimensional vector space and let f :V— V 
be an endomorphism. If dim ker(f — a)® = 33 and dimker(f — a)’ = 27 for 
some a € K, determine the possible number of 8 x 8 Jordan blocks in the Jordan 
canonical form of f. 


Exercise 4.63. Let V be a finite dimensional vector space, f : VY — V an 
endomorphism, and a € K. Show that it is not possible to have dimY = 8, 
m(t) = (t— a)" and dimker(f — a) = 1. 


4.4.3. Rational form 


Exercise 4.64. Let V be a finite dimensional vector space, f : V — V an 
endomorphism, and v € V. If my(t) = (t?—t+1)° and that v ¢ ker(f?— f+1)°, 
what can my y be? 


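Exercise 4.65. Let $V$ be a finite dimensional vector space, $v, w \in V$, and $f : V \to V$ an endomorphism. If $\mathrm{GCD}(m_{f,v}, m_{f,w}) = 1$, show that $m_{f,v+w} = m_{f,v}\, m_{f,w}$.


Exercise 4.66. Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be the endomorphism defined by

\[
f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} -3 & 43 & -17 \\ -4 & 29 & -10 \\ -8 & 60 & -21 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}.
\]

Find a vector $v \in \mathbb{R}^3$ such that $m_{f,v} = (t+1)(t-3)$.

Exercise 4.67. Let $f : \mathbb{R}^3 \to \mathbb{R}^3$ be the endomorphism defined by

\[
f\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} -3 & 43 & -17 \\ -4 & 29 & -10 \\ -8 & 60 & -21 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}.
\]

Find a vector $v \in \mathbb{R}^3$ such that the set $\mathcal{B} = \{v, f(v), f^2(v)\}$ is a basis of $\mathbb{R}^3$ and determine the $\mathcal{B}$-matrix of $f$.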


Exercise 4.68. Let V be a finite dimensional vector space and let f :V— V 
be an endomorphism. If the invariant factors of f are (t — 2)3(¢ — 5)? and 
(t — 2)(t — 5)?, determine the Jordan canonical form of f 


Exercise 4.69. Let V be a finite dimensional vector space and let f :V— V 
be an endomorphism. If the minimal polynomials of the Jordan blocks of the 
Jordan canonical form of f are 


(6 =1)° -¢ S1)? f= 1, —4)9, = 47, G27, = 277 = 2/77, = 9), 


determine the invariant factors of f. 




Exercise 4.70. Let V be a vector space such that dim V = 8 and let f: VV 
be an endomorphism. If m+ = (¢?+1)(t?+t+1), determine all possible invariant 
factors of f. 


Exercise 4.71. Let V be a finite-dimensional vector space, f : V > V an 
endomorphism, and 6 a basis of V. If the B-matrix of f is 


eae oe i) 


1 0) Sacer To 
Os S30 604) 
Or OO aL 
OO 0" E, -=8 


show that this matrix is the rational form of f and determine the Jordan canon- 
ical form of f. 


Exercise 4.72. Find a vector space Y and an endomorphism f : V > VY such 
that cp = (t? +¢4+1)3(t2 +1) and my = (t? +¢4+1)(? +1). 


Exercise 4.73. Let V be a finite-dimensional vector space, f : V > V an 
endomorphism, and B a basis of V. If the B-matrix of f is 


ie, Qty 105 08: 
it {5 
O: 208° 208 Ss hx, 
OF TPO 
O- 0 OO 6 


determine the rational form of f. 


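Exercise 4.74. Let $V$ be a vector space such that $\dim V = 5$ and let $f : V \to V$ be an endomorphism. If the Jordan canonical form of $f$ is

\[
\begin{bmatrix}
\alpha & 1 & 0 & 0 & 0 \\
0 & \alpha & 1 & 0 & 0 \\
0 & 0 & \alpha & 0 & 0 \\
0 & 0 & 0 & \beta & 1 \\
0 & 0 & 0 & 0 & \beta
\end{bmatrix},
\]

where $\alpha \neq \beta$, determine the rational form of $f$.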


Exercise 4.75. Let V be a finite-dimensional vector space, f : V — V an 
endomorphism, and 6 a basis of V. If the B-matrix of f is 


0000 -1 
1000 -5 
G. 0s GS 210) « 
G20 “Pio. =o 
O70 OF 5 


determine the minimal polynomial of f. 




Exercise 4.76. Let $V$ be a finite-dimensional vector space, $f : V \to V$ an endomorphism, and $\mathcal{B}$ a basis of $V$. If the $\mathcal{B}$-matrix of $f$ is

\[
\begin{bmatrix} 0 & -3 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 \end{bmatrix},
\]

determine the rational form of $f$.


Exercise 4.77. Determine a vector space $V$ and an endomorphism $f : V \to V$ such that $m_f = 3 - t - 4t^2 + 2t^3 + t^4$.




Chapter 5 


Appendices 


Appendix A Permutations 
By a permutation on $\{1, 2, \dots, n\}$ we mean a bijection
\[
\sigma : \{1, 2, \dots, n\} \to \{1, 2, \dots, n\}.
\]

The set of all permutations on $\{1, 2, \dots, n\}$ will be denoted by $\mathfrak{S}_n$. Note that the identity function $\mathrm{id} : \{1, 2, \dots, n\} \to \{1, 2, \dots, n\}$ is an element of $\mathfrak{S}_n$.

A permutation $\sigma \in \mathfrak{S}_n$ defines a permutation on any set with $n$ objects. For example, if $\sigma \in \mathfrak{S}_n$ and $(x_1, \dots, x_n)$ is an ordered $n$-tuple of vectors, then $(x_{\sigma(1)}, \dots, x_{\sigma(n)})$ is the corresponding permutation of $(x_1, \dots, x_n)$.

If we write the first $n$ integers $\ge 1$ in the order $1, 2, \dots, n$, then we may think of a permutation $\sigma \in \mathfrak{S}_n$ as being a reordering of these numbers such that we get $\sigma(1), \sigma(2), \dots, \sigma(n)$. A permutation $\sigma \in \mathfrak{S}_n$ will be called a transposition if there are distinct $j, k \in \{1, 2, \dots, n\}$ such that

\[
\sigma(j) = k, \quad \sigma(k) = j, \quad \text{and} \quad \sigma(m) = m \ \text{for every } m \in \{1, 2, \dots, n\} \text{ different from } j \text{ and } k.
\]


In other words, a transposition switches two numbers and leaves the remaining numbers where they were. We will use the symbol $\sigma_{jk}$ to denote the above transposition. Note that $\sigma_{jk}^{-1} = \sigma_{jk}$ for every $j, k \in \{1, 2, \dots, n\}$.

A transposition of the form $\sigma_{j,j+1}$ is called an elementary transposition.


Theorem 5.1. Every permutation in 6, with n > 2 is an elementary 


transposition or a composition of elementary transpositions. 


Proof. We will prove this result using induction on n. Since every permutation 
in o € G2 is an elementary transposition, the statement is true for n = 2. Now 
assume that the statement is true for every k <n for some n > 2. 


301 


302 Chapter 5: Appendices 


Let $\sigma : \{1,2,\dots,n\} \to \{1,2,\dots,n\}$ be a permutation. If $\sigma(n) = j$ for some
$j < n$, then the permutation

\[ \tau = \sigma_{n,n-1} \cdots \sigma_{j+1,j}\,\sigma \]

satisfies $\tau(n) = n$ and we have

\[ \sigma = \sigma_{j+1,j}^{-1} \cdots \sigma_{n,n-1}^{-1}\,\tau = \sigma_{j+1,j} \cdots \sigma_{n,n-1}\,\tau. \]

This means that, if $\tau$ is a product of elementary transpositions, then $\sigma$ is also a
product of elementary transpositions.

Now, if $\sigma(n) = n$, then the restriction of $\sigma$ to the set $\{1,2,\dots,n-1\}$
is a product of elementary transpositions on $\{1,2,\dots,n-1\}$, by the inductive
assumption. Because every elementary transposition $\tau$ on $\{1,2,\dots,n-1\}$ can be
extended to an elementary transposition $\tau'$ on $\{1,2,\dots,n\}$ by taking $\tau'(j) = \tau(j)$
for $j < n$ and $\tau'(n) = n$, $\sigma$ is a product of elementary transpositions. □
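The statement of Theorem 5.1 can be illustrated computationally. The following Python sketch is not the inductive construction used in the proof: it swaps in a bubble sort, which records a sequence of swaps of adjacent positions, and then rebuilds the permutation from the corresponding elementary transpositions. Permutations are stored as tuples with `sigma[i-1]` equal to $\sigma(i)$; the helper names `adjacent_swaps`, `elementary`, `compose`, and `rebuild` are illustrative assumptions, not notation from the text.

```python
from itertools import permutations


def adjacent_swaps(sigma):
    """Bubble-sort sigma and return the 1-based positions j whose
    entries at positions j and j+1 were swapped, in order."""
    s = list(sigma)
    swaps = []
    for _ in range(len(s)):
        for i in range(len(s) - 1):
            if s[i] > s[i + 1]:
                s[i], s[i + 1] = s[i + 1], s[i]
                swaps.append(i + 1)
    return swaps


def elementary(n, j):
    """The elementary transposition sigma_{j,j+1} in S_n as a tuple."""
    t = list(range(1, n + 1))
    t[j - 1], t[j] = t[j], t[j - 1]
    return tuple(t)


def compose(f, g):
    """The composition f∘g, that is, i -> f(g(i))."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


def rebuild(n, swaps):
    """Compose t_r ∘ ... ∘ t_1, where t_1, ..., t_r are the recorded swaps.

    Swapping positions j, j+1 of sigma is sigma ∘ sigma_{j,j+1}, so sorting
    gives sigma ∘ t_1 ∘ ... ∘ t_r = id and hence sigma = t_r ∘ ... ∘ t_1.
    """
    result = tuple(range(1, n + 1))
    for j in swaps:
        result = compose(elementary(n, j), result)
    return result


# Check the decomposition for every permutation in S_4.
all_ok = all(rebuild(4, adjacent_swaps(p)) == p
             for p in permutations(range(1, 5)))
```

For instance, for the permutation `(3, 1, 2)` the recorded swaps are at positions 1 and 2, and composing the two elementary transpositions in the order described in `rebuild` returns `(3, 1, 2)`.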


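Definition 5.2. Let $\sigma \in \mathfrak{S}_n$. A pair of integers $(j,k)$, $j < k$, is called an
inversion of $\sigma$ if $\sigma(j) > \sigma(k)$. A permutation with an even number of
inversions is called an even permutation and a permutation with an odd
number of inversions is called an odd permutation. We define the sign
of a permutation $\sigma$, denoted by $\varepsilon(\sigma)$, to be $1$ if $\sigma$ is even and $-1$ if $\sigma$ is
odd, that is,

\[ \varepsilon(\sigma) = \begin{cases} 1 & \text{if } \sigma \text{ is even,} \\ -1 & \text{if } \sigma \text{ is odd.} \end{cases} \]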


Example 5.3. Show that every transposition is an odd permutation.

Proof. Let $\sigma_{jk}$ be a transposition where $j < k$. We represent this transposition
by

\[ (1, \dots, j-1, k, j+1, \dots, k-1, j, k+1, \dots, n). \]
The inversions are
\[ (j,j+1), \dots, (j,k), (j+1,k), \dots, (k-1,k). \]

The total number of inversions is the odd number $2(k-j)-1$. □
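The definition of inversions and the count for a transposition can be checked directly. This is a minimal Python sketch under the assumption that a permutation is stored as a tuple with `sigma[i-1]` equal to $\sigma(i)$; the function names are hypothetical.

```python
from itertools import combinations


def inversions(sigma):
    """All pairs (j, k) with j < k and sigma(j) > sigma(k), 1-based."""
    n = len(sigma)
    return [(j, k) for j, k in combinations(range(1, n + 1), 2)
            if sigma[j - 1] > sigma[k - 1]]


def sign(sigma):
    """1 for an even number of inversions, -1 for an odd number."""
    return 1 if len(inversions(sigma)) % 2 == 0 else -1


def transposition(n, j, k):
    """The transposition sigma_{jk} in S_n as a tuple."""
    t = list(range(1, n + 1))
    t[j - 1], t[k - 1] = t[k - 1], t[j - 1]
    return tuple(t)


# The transposition sigma_{25} in S_6 has 2(5 - 2) - 1 = 5 inversions.
example = transposition(6, 2, 5)
example_inversions = inversions(example)
```

Here `example` is `(1, 5, 3, 4, 2, 6)` and its inversions are `(2,3), (2,4), (2,5), (3,5), (4,5)`, in agreement with the count $2(k-j)-1$.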




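Theorem 5.4. The sign of a product of permutations is the product of
the signs of these permutations, that is,

\[ \varepsilon(\sigma_1 \cdots \sigma_k) = \varepsilon(\sigma_1) \cdots \varepsilon(\sigma_k), \]

for any $\sigma_1,\dots,\sigma_k \in \mathfrak{S}_n$.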


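Proof. Let $\sigma \in \mathfrak{S}_n$ and let $\tau = \sigma\sigma_{j+1,j}$.

First note that, since $(\tau(j),\tau(j+1)) = (\sigma(j+1),\sigma(j))$, the pair $(\tau(j),\tau(j+1))$
is an inversion if and only if $(\sigma(j),\sigma(j+1))$ is not an inversion. Moreover,
$(\tau(p),\tau(q)) = (\sigma(p),\sigma(q))$ if $p \neq j$, $p \neq j+1$, $q \neq j$, and $q \neq j+1$. Since

\begin{align*}
\{(\tau(1),\tau(j)),\dots,(\tau(j-1),\tau(j))\} &= \{(\sigma(1),\sigma(j+1)),\dots,(\sigma(j-1),\sigma(j+1))\}, \\
\{(\tau(1),\tau(j+1)),\dots,(\tau(j-1),\tau(j+1))\} &= \{(\sigma(1),\sigma(j)),\dots,(\sigma(j-1),\sigma(j))\}, \\
\{(\tau(j),\tau(j+2)),\dots,(\tau(j),\tau(n))\} &= \{(\sigma(j+1),\sigma(j+2)),\dots,(\sigma(j+1),\sigma(n))\}, \\
\{(\tau(j+1),\tau(j+2)),\dots,(\tau(j+1),\tau(n))\} &= \{(\sigma(j),\sigma(j+2)),\dots,(\sigma(j),\sigma(n))\},
\end{align*}

we have

\begin{align*}
&\{(\tau(1),\tau(j)),\dots,(\tau(j-1),\tau(j)),(\tau(1),\tau(j+1)),\dots,(\tau(j-1),\tau(j+1))\} \\
&\quad = \{(\sigma(1),\sigma(j)),\dots,(\sigma(j-1),\sigma(j)),(\sigma(1),\sigma(j+1)),\dots,(\sigma(j-1),\sigma(j+1))\}
\end{align*}

and

\begin{align*}
&\{(\tau(j),\tau(j+2)),\dots,(\tau(j),\tau(n)),(\tau(j+1),\tau(j+2)),\dots,(\tau(j+1),\tau(n))\} \\
&\quad = \{(\sigma(j),\sigma(j+2)),\dots,(\sigma(j),\sigma(n)),(\sigma(j+1),\sigma(j+2)),\dots,(\sigma(j+1),\sigma(n))\},
\end{align*}

so these pairs contribute the same number of inversions to $\tau$ and to $\sigma$.
Hence $\varepsilon(\tau) = -\varepsilon(\sigma)$, because $(\tau(j),\tau(j+1)) = (\sigma(j+1),\sigma(j))$.
Since every permutation is a product of elementary transpositions, the result
follows by induction. □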
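As a sanity check of Theorem 5.4 (not a proof), the multiplicativity of the sign can be verified by brute force on a small symmetric group. The sketch below assumes permutations are tuples with `sigma[i-1]` equal to $\sigma(i)$ and that the product $\sigma\tau$ means the composition $i \mapsto \sigma(\tau(i))$; the names `sign` and `compose` are illustrative.

```python
from itertools import permutations


def sign(sigma):
    """Sign computed from the parity of the number of inversions."""
    n = len(sigma)
    count = sum(1 for j in range(n) for k in range(j + 1, n)
                if sigma[j] > sigma[k])
    return -1 if count % 2 else 1


def compose(f, g):
    """The composition f∘g, that is, i -> f(g(i))."""
    return tuple(f[g[i] - 1] for i in range(len(g)))


# Exhaustive check over all 24 * 24 pairs of permutations in S_4.
S4 = list(permutations(range(1, 5)))
multiplicative = all(sign(compose(s, t)) == sign(s) * sign(t)
                     for s in S4 for t in S4)
```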


Proof. Since 


we have 




Appendix B Complex numbers 


The set of complex numbers, denoted by $\mathbb{C}$, can be identified with the set $\mathbb{R}^2$
where we define addition as in the vector space $\mathbb{R}^2$, that is

\[ (a,b) + (c,d) = (a+c, b+d), \]
and the multiplication by
\[ (a,b)(c,d) = (ac - bd, ad + bc). \]

If $z = (x,y)$, then by $-z$ we denote the number $(-x,-y)$. Note that $z + (-z) = (0,0)$.

If $z = (x,y) \neq (0,0)$, then by $z^{-1}$ or $\frac{1}{z}$ we denote the number $\left(\frac{x}{x^2+y^2}, \frac{-y}{x^2+y^2}\right)$.
Note that $zz^{-1} = z^{-1}z = (1,0)$.
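The pair definitions above translate directly into code. The following Python sketch models complex numbers as pairs of exact rationals, so the identities can be checked without rounding; the function names `add`, `mul`, and `inv` are assumptions made for illustration.

```python
from fractions import Fraction


def add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    return (z[0] + w[0], z[1] + w[1])


def mul(z, w):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def inv(z):
    """z^{-1} = (x / (x^2 + y^2), -y / (x^2 + y^2)) for z != (0, 0)."""
    x, y = z
    norm_sq = x * x + y * y
    if norm_sq == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (Fraction(x) / norm_sq, Fraction(-y) / norm_sq)


z = (3, 4)
product_with_inverse = mul(z, inv(z))   # equals (1, 0)
i_squared = mul((0, 1), (0, 1))         # equals (-1, 0)
```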


Theorem 5.6. If $z, z_1, z_2, z_3 \in \mathbb{C}$, then
(a) $(z_1 + z_2) + z_3 = z_1 + (z_2 + z_3)$;
(b) $z_1 + z_2 = z_2 + z_1$;
(c) $(z_1 z_2) z_3 = z_1 (z_2 z_3)$;
(g) $(z_1 + z_2) z_3 = z_1 z_3 + z_2 z_3$.


All these properties are easy to verify. As an example, we verify property
(c). If $z_1 = (a,b)$, $z_2 = (c,d)$, and $z_3 = (e,f)$, then

\[ (z_1z_2)z_3 = (ac-bd, ad+bc)(e,f) = ((ac-bd)e - (ad+bc)f, (ac-bd)f + (ad+bc)e) \]
and
\[ z_1(z_2z_3) = (a,b)(ce-df, cf+de) = (a(ce-df) - b(cf+de), a(cf+de) + b(ce-df)). \]

Since

\[ (ac-bd)e - (ad+bc)f = a(ce-df) - b(cf+de) \]
and

\[ (ac-bd)f + (ad+bc)e = a(cf+de) + b(ce-df), \]

we get $(z_1z_2)z_3 = z_1(z_2z_3)$.
The function $\varphi : \mathbb{R} \to \{(x,0) : x \in \mathbb{R}\}$, defined by $\varphi(x) = (x,0)$, is bijective
and satisfies




(a) $\varphi(x+y) = \varphi(x) + \varphi(y)$;
(b) $\varphi(xy) = \varphi(x)\varphi(y)$.

This observation makes it possible to identify the real number $x$ with the complex
number $(x,0)$. It is easy to verify that

\[ (0,1)(0,1) = -1. \]
The complex number $(0,1)$ is denoted by $i$. Consequently we can write
\[ i^2 = -1. \]
If $z = (x,y)$ is a complex number, then
\[ z = (x,y) = (x,0) + (y,0)(0,1) = x + yi, \]

which is the standard notation for complex numbers. In this notation the product
of the complex numbers $(a,b)$ and $(c,d)$ becomes

\[ (a,b)(c,d) = (a+bi)(c+di) = (ac - bd) + (ad + bc)i. \]

A complex number $z$ is called purely imaginary if there is a nonzero real
number $y$ such that $z = (0,y) = yi$. The complex conjugate of a number
$z = x + yi$ is the number $x - yi$ and it is denoted by $\bar{z}$.


Theorem 5.7. If $z, z_1, z_2 \in \mathbb{C}$, then

(a) $\bar{\bar{z}} = z$;

(b) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$;

(c) $\overline{z_1 z_2} = \bar{z}_1\,\bar{z}_2$.


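Proof. Clearly we have $\bar{\bar{z}} = z$ and $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$. To prove $\overline{z_1 z_2} = \bar{z}_1\,\bar{z}_2$ we
let $z_1 = x_1 + y_1 i$ and $z_2 = x_2 + y_2 i$. Now
\begin{align*}
\overline{z_1 z_2} &= \overline{(x_1 + y_1 i)(x_2 + y_2 i)} \\
&= \overline{x_1x_2 - y_1y_2 + (x_1y_2 + y_1x_2)i} \\
&= x_1x_2 - y_1y_2 - (x_1y_2 + y_1x_2)i
\end{align*}

and

\begin{align*}
\bar{z}_1\,\bar{z}_2 &= \overline{x_1 + y_1 i}\;\overline{x_2 + y_2 i} \\
&= (x_1 - y_1 i)(x_2 - y_2 i) \\
&= x_1x_2 - y_1y_2 - (x_1y_2 + y_1x_2)i.
\end{align*}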




Note that a complex number $z$ is real if and only if $z = \bar{z}$ and a nonzero complex
number $z$ is purely imaginary if and only if $z = -\bar{z}$.
The absolute value of the number $z = x + yi$ is the nonnegative number
$\sqrt{x^2 + y^2}$ and is denoted by $|z|$.


Theorem 5.8. If $z, z_1, z_2 \in \mathbb{C}$, then

(a) $z \neq 0$ if and only if $|z| \neq 0$;

(b) $|\bar{z}| = |z|$;

(c) $z\bar{z} = |z|^2$;

(d) $|z_1 + z_2| \leq |z_1| + |z_2|$;

(e) $|z_1 z_2| = |z_1|\,|z_2|$.


Proof. Parts (a), (b), and (c) are direct consequences of the definitions.
To prove $|z_1 + z_2| \leq |z_1| + |z_2|$ we let $z_1 = x_1 + y_1 i$ and $z_2 = x_2 + y_2 i$.
Since both numbers $|z_1 + z_2|$ and $|z_1| + |z_2|$ are nonnegative, the inequality
$|z_1 + z_2| \leq |z_1| + |z_2|$ is equivalent to the inequality $|z_1 + z_2|^2 \leq (|z_1| + |z_2|)^2$
which is equivalent to the inequality

\[ x_1x_2 + y_1y_2 \leq \sqrt{x_1^2 + y_1^2}\,\sqrt{x_2^2 + y_2^2}. \]

The above inequality is a direct consequence of the inequality
\[ (x_1x_2 + y_1y_2)^2 \leq (x_1^2 + y_1^2)(x_2^2 + y_2^2) \]
which is equivalent to the inequality
\[ 2x_1x_2y_1y_2 \leq x_1^2y_2^2 + y_1^2x_2^2 \]

which can be written as

\[ 0 \leq (x_1y_2 - y_1x_2)^2. \]
Now to prove (e) we observe that
\[ |z_1z_2|^2 = z_1z_2\,\overline{z_1z_2} = z_1z_2\,\bar{z}_1\bar{z}_2 = z_1\bar{z}_1\,z_2\bar{z}_2 = |z_1|^2|z_2|^2, \]

which gives us $|z_1z_2| = |z_1|\,|z_2|$. □


Corollary 5.9. For every complex number $z \neq 0$, there are a unique real
number $r > 0$ and a unique complex number $v$ such that $|v| = 1$ and $z = rv$.

Proof. We can take $r = |z|$ and $v = \frac{z}{|z|}$, because $\bigl||z|\bigr| = |z|$ and hence $|v| = 1$.
If $z = sw$ where $s > 0$ and $|w| = 1$, then $|z| = |s|\,|w| = s$, by Theorem 5.8,
and thus $w = \frac{z}{|z|}$. □
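The decomposition $z = rv$ can be tried numerically. This sketch uses Python's built-in `complex` type and floating-point arithmetic, so equalities hold only up to rounding; the function name `polar_parts` is a hypothetical helper, not notation from the text.

```python
import cmath
import math


def polar_parts(z):
    """Return (r, v) with r = |z| > 0, |v| = 1 and z = r * v, for z != 0."""
    if z == 0:
        raise ValueError("the decomposition requires z != 0")
    r = abs(z)       # r = |z|
    v = z / r        # v = z / |z|
    return r, v


r, v = polar_parts(3 + 4j)
unit_length = math.isclose(abs(v), 1.0)   # |v| = 1 up to rounding
recovered = cmath.isclose(r * v, 3 + 4j)  # z = r v up to rounding
```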




Appendix C Polynomials 
A polynomial is a function $p : \mathbb{K} \to \mathbb{K}$ defined by
\[ p(t) = a_nt^n + a_{n-1}t^{n-1} + \cdots + a_1t + a_0 \]

where $n$ is a nonnegative integer and $a_0,\dots,a_n \in \mathbb{K}$. We denote the set of all
polynomials by $\mathcal{P}(\mathbb{K})$.
If $p,q \in \mathcal{P}(\mathbb{K})$ and $\alpha \in \mathbb{K}$, then we define polynomials $p + q$, $pq$, and $\alpha p$ by

\[ (p+q)(t) = p(t) + q(t), \quad (pq)(t) = p(t)q(t), \quad \text{and} \quad (\alpha p)(t) = \alpha p(t). \]

The numbers $a_0,\dots,a_n$ are uniquely defined by $p$. To prove this important
fact we first prove the following result which is a form of the Euclidean algorithm.


Theorem 5.10. Let $p(t) = a_nt^n + \cdots + a_0$ and $s(t) = b_mt^m + \cdots + b_0$
be polynomials such that $n \geq m > 0$ and $b_m \neq 0$. Then there are
polynomials $q(t) = c_{n-m}t^{n-m} + \cdots + c_0$ and $r(t) = d_kt^k + \cdots + d_0$ such
that $m > k \geq 0$ and
\[ p = qs + r. \]


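Proof. It is easy to verify that

\[ p_1(t) = p(t) - \frac{a_n}{b_m}t^{n-m}s(t) = a'_{n-1}t^{n-1} + \cdots + a'_0 \]
for some $a'_{n-1},\dots,a'_0 \in \mathbb{K}$.
If $n - 1 < m$, then
\[ p(t) = \frac{a_n}{b_m}t^{n-m}s(t) + p_1(t) \]

and we can take $q(t) = \frac{a_n}{b_m}t^{n-m}$ and $r = p_1$.

If $n - 1 \geq m$, then we can show by induction that there are polynomials
$q_1(t) = c'_{n-m-1}t^{n-m-1} + \cdots + c'_0$ and $r_1(t) = d'_jt^j + \cdots + d'_0$ such that $m > j \geq 0$
and

\[ p_1 = q_1s + r_1. \]

Consequently,

\[ p(t) = \frac{a_n}{b_m}t^{n-m}s(t) + \bigl(q_1(t)s(t) + r_1(t)\bigr) = \Bigl(\frac{a_n}{b_m}t^{n-m} + q_1(t)\Bigr)s(t) + r_1(t) \]

and we can take $q(t) = \frac{a_n}{b_m}t^{n-m} + q_1(t)$ and $r = r_1$. □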
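The reduction step of the proof, subtracting $\frac{a_n}{b_m}t^{n-m}s(t)$ to lower the degree, can be written as a loop. The following Python sketch is an iterative version of that inductive step, under the assumption that a polynomial is stored as a coefficient list `[a_0, a_1, ..., a_n]` with exact `Fraction` arithmetic; the names `trim` and `divide` are illustrative.

```python
from fractions import Fraction


def trim(p):
    """Drop zero coefficients of highest degree (the zero polynomial is [])."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def divide(p, s):
    """Return (q, r) with p = q*s + r and r zero or deg r < deg s."""
    p = [Fraction(c) for c in trim(p)]
    s = [Fraction(c) for c in trim(s)]
    if not s:
        raise ZeroDivisionError("division by the zero polynomial")
    q = [Fraction(0)] * max(len(p) - len(s) + 1, 0)
    while len(p) >= len(s):
        shift = len(p) - len(s)       # the exponent n - m
        factor = p[-1] / s[-1]        # the quotient a_n / b_m
        q[shift] = factor
        for i, b in enumerate(s):     # p <- p - factor * t^(n-m) * s
            p[shift + i] -= factor * b
        p = trim(p)                   # the leading term has cancelled
    return q, p


# Divide 4t^3 + 5t^2 + 15t + 8 by t^2 + t + 3.
quotient, remainder = divide([8, 15, 5, 4], [3, 1, 1])
```

With these inputs the quotient is `[1, 4]`, that is $4t + 1$, and the remainder is `[5, 2]`, that is $2t + 5$.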


Definition 5.11. By a root of a nonzero polynomial $p \in \mathcal{P}(\mathbb{K})$ we mean
any number $\alpha \in \mathbb{K}$ such that $p(\alpha) = 0$.




Theorem 5.12. If $\alpha \in \mathbb{K}$ is a root of a polynomial $p(t) = a_nt^n + \cdots + a_0$,
then there is a polynomial $q(t) = c_{n-1}t^{n-1} + \cdots + c_0$ such that

\[ p(t) = (t - \alpha)q(t). \]

Proof. According to Theorem 5.10 there are a polynomial $q(t) = c_{n-1}t^{n-1} + \cdots + c_0$
and a number $r \in \mathbb{K}$ such that

\[ p(t) = (t - \alpha)q(t) + r. \]
Since $p(\alpha) = 0$, we get $r = 0$ and thus $p(t) = (t - \alpha)q(t)$. □


Theorem 5.13. If a polynomial $p(t) = a_nt^n + \cdots + a_0$ has $n+1$ distinct
roots, then $p(t) = 0$ for every $t \in \mathbb{K}$.

Proof. Let $\alpha_1,\dots,\alpha_{n+1}$ be distinct roots of the polynomial $p$. From Theorem
5.12 we obtain, by induction, that there is a $c \in \mathbb{K}$ such that

\[ p(t) = c(t - \alpha_1) \cdots (t - \alpha_n). \]

Since
\[ p(\alpha_{n+1}) = c(\alpha_{n+1} - \alpha_1) \cdots (\alpha_{n+1} - \alpha_n) = 0, \]

we get $c = 0$ and consequently $p = 0$. □

Corollary 5.14. If $p \in \mathcal{P}(\mathbb{K})$ and $p(t) = 0$ for all $t \in \mathbb{K}$, then $p = 0$.

Now we are ready to prove the important result mentioned at the beginning,
that is, the uniqueness of numbers $a_n,\dots,a_0$ for a polynomial $p(t) = a_nt^n + \cdots + a_0$.


Theorem 5.15. Let $p(t) = a_nt^n + \cdots + a_0$ and $q(t) = b_mt^m + \cdots + b_0$ with $a_n \neq 0$ and $b_m \neq 0$.
If $p(t) = q(t)$ for all $t \in \mathbb{K}$, then $n = m$ and $a_j = b_j$ for every $j \in \{0,1,\dots,n\}$.

Proof. Since $(p - q)(t) = 0$ for all $t \in \mathbb{K}$, the result follows from Corollary
5.14. □


If $p(t) = a_nt^n + \cdots + a_0$, then the numbers $a_0,\dots,a_n$ are called the coefficients
of $p$. If $p$ is a nonzero polynomial and $a_n \neq 0$, then $a_n$ is called the leading
coefficient of the polynomial $p$ and $n$ is called the degree of $p$, denoted by $\deg p$.
We do not define the degree of the zero polynomial, that is, the polynomial




defined by $p(t) = 0$ for every $t \in \mathbb{K}$. Note that the degree of a polynomial is
well defined because of Theorem 5.15.

Let $n$ be a nonnegative integer. The set of all polynomials of the form
$a_0 + a_1t + \cdots + a_nt^n$ is denoted by $\mathcal{P}_n(\mathbb{K})$. Since $\mathcal{P}_0(\mathbb{K})$ can be identified with $\mathbb{K}$, we
often do not distinguish between a constant polynomial $\alpha$ and the number $\alpha$.

The following theorem is an immediate consequence of the definition of the 
degree of a polynomial. 


Theorem 5.16. If $p$ and $q$ are nonzero polynomials, then

\[ \deg(p + q) \leq \max\{\deg p, \deg q\} \quad \text{and} \quad \deg(pq) = \deg p + \deg q. \]

Note that $\deg(\alpha p) = \deg p$ for any nonzero polynomial $p$ and any nonzero
number $\alpha$.
Now we can formulate the final form of Theorem 5.10.

Theorem 5.17. If $p,s \in \mathcal{P}(\mathbb{K})$ and $s$ is a nonzero polynomial, then
there are unique $q,r \in \mathcal{P}(\mathbb{K})$ such that $\deg r < \deg s$ or $r = 0$ and we
have

\[ p = qs + r. \]


Proof. The existence of $q,r \in \mathcal{P}(\mathbb{K})$ can be obtained by a slight modification of
the proof of Theorem 5.10.
Suppose that we have

\[ p = qs + r \quad \text{and} \quad p = q_1s + r_1, \]

for some $q,r,q_1,r_1 \in \mathcal{P}(\mathbb{K})$ such that $\deg r < \deg s$ or $r = 0$ and $\deg r_1 < \deg s$
or $r_1 = 0$. Then

\[ 0 = (q - q_1)s + r - r_1. \]
If $q - q_1 \neq 0$, then $(q - q_1)s \neq 0$, because $\deg(q - q_1) + \deg s = \deg((q - q_1)s)$.
In that case $\deg((q - q_1)s) \geq \deg s$, which is impossible because $(q - q_1)s = r_1 - r$ is either zero or of degree less than $\deg s$.
Consequently, $q - q_1 = 0$ and $r = r_1$. □

The polynomial $q$ is called the quotient on division of the polynomial $p$ by
the polynomial $s$ and the polynomial $r$ is the remainder on this division.

Probably the student is familiar with the long division algorithm. We do
not use it in the next example.


Example 5.18. Let $p(t) = 4t^3 + 5t^2 + 15t + 8$ and $s(t) = t^2 + t + 3$. Find the
quotient and the remainder on division of $p$ by $s$.

Solution. In this case, it is easy to see that $q(t) = at + b$ and $r(t) = ct + d$. We




have

\[ 4t^3 + 5t^2 + 15t + 8 = (t^2 + t + 3)(at + b) + ct + d = at^3 + (a+b)t^2 + (3a + b + c)t + 3b + d \]
which yields
\begin{align*}
4 &= a; \\
5 &= a + b; \\
15 &= 3a + b + c; \\
8 &= 3b + d.
\end{align*}
This gives $q(t) = 4t + 1$ and $r(t) = 2t + 5$. □
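The four equations can be solved one after another, and the result checked by expanding $qs + r$. The Python sketch below does exactly that; polynomials are coefficient lists `[a_0, a_1, ...]` and the helpers `poly_mul` and `poly_add` are assumptions introduced only for this check.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [a_0, a_1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


def poly_add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + y for x, y in zip(p, q)]


# Solve the system from the comparison of coefficients, top to bottom.
a = 4
b = 5 - a            # from 5 = a + b
c = 15 - 3 * a - b   # from 15 = 3a + b + c
d = 8 - 3 * b        # from 8 = 3b + d

s = [3, 1, 1]        # s(t) = t^2 + t + 3
q = [b, a]           # q(t) = at + b
r = [d, c]           # r(t) = ct + d
expanded = poly_add(poly_mul(q, s), r)   # coefficients of q*s + r
```

The solved values are $a = 4$, $b = 1$, $c = 2$, $d = 5$, and `expanded` reproduces the coefficients of $4t^3 + 5t^2 + 15t + 8$.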


A polynomial p is called monic if its leading coefficient is 1. 
If p and q are polynomials such that there is a polynomial s satisfying p = qs, 


then we say that q divides p or that q is a divisor of p. 


Theorem 5.19. Let $p_1,\dots,p_n$ be nonzero polynomials. There is a
unique monic polynomial $d$ such that

\[ \{s_1p_1 + \cdots + s_np_n : s_1,\dots,s_n \in \mathcal{P}(\mathbb{K})\} = \{sd : s \in \mathcal{P}(\mathbb{K})\}. \]

The polynomial $d$ divides all polynomials $p_1,\dots,p_n$. Moreover, if a
nonzero polynomial $h$ divides all polynomials $p_1,\dots,p_n$, then $h$ divides
$d$.

Proof. Let $d$ be a nonzero polynomial of the smallest degree in the set
\[ \mathcal{F} = \{s_1p_1 + \cdots + s_np_n : s_1,\dots,s_n \in \mathcal{P}(\mathbb{K})\}. \]

Multiplying $d$ by a nonzero number if necessary, we may assume that $d$ is monic.
Then $d$ divides $p_j$ for every $j \in \{1,\dots,n\}$. Indeed, by Theorem 5.17, for every
$j \in \{1,\dots,n\}$ we have $p_j = q_jd + r_j$, for some polynomials $q_j$ and $r_j$, and
thus $r_j \in \mathcal{F}$. If $r_j \neq 0$ for some $j \in \{1,\dots,n\}$, then we get a contradiction
because $\deg r_j < \deg d$ and $d$ is a nonzero polynomial of the smallest degree in
$\mathcal{F}$. Consequently we must have $r_j = 0$ for every $j \in \{1,\dots,n\}$ which means
that for every $j \in \{1,\dots,n\}$ the polynomial $d$ is a divisor of the polynomial $p_j$.
Since

\[ s_1p_1 + \cdots + s_np_n = (s_1q_1 + \cdots + s_nq_n)d, \]

we have $\mathcal{F} \subseteq \{sd : s \in \mathcal{P}(\mathbb{K})\}$. On the other hand, because $d \in \mathcal{F}$ there are
polynomials $h_1,\dots,h_n$ such that

\[ d = h_1p_1 + \cdots + h_np_n \tag{5.1} \]

and consequently $\{sd : s \in \mathcal{P}(\mathbb{K})\} \subseteq \mathcal{F}$.




Now, if $d_1$ and $d_2$ are monic polynomials such that
\[ \{sd_1 : s \in \mathcal{P}(\mathbb{K})\} = \{sd_2 : s \in \mathcal{P}(\mathbb{K})\}, \]

then there are polynomials $s_1$ and $s_2$ such that $d_1 = s_1d_2$ and $d_2 = s_2d_1$. Then
$d_1 = s_1s_2d_1$ and thus $s_1s_2 = 1$. Since $d_1$ and $d_2$ are monic polynomials, we
conclude that $d_1 = d_2$.

Finally, if a nonzero polynomial $h$ divides all polynomials $p_1,\dots,p_n$, then $h$
divides $d$ as a consequence of (5.1). □

As a consequence of Theorem 5.19 and its proof we get the following result.

Theorem 5.20. Let $p_1,\dots,p_n$ be nonzero polynomials. There is a
unique monic polynomial $d$ such that $d$ divides all polynomials $p_1,\dots,p_n$
and if a nonzero polynomial $h$ divides all polynomials $p_1,\dots,p_n$, then $h$
divides $d$.

Definition 5.21. Let $p_1,\dots,p_n$ be nonzero polynomials. The unique
monic polynomial $d$ from Theorem 5.20 is called the greatest common
divisor of the polynomials $p_1,\dots,p_n$ and is denoted by

\[ \mathrm{GCD}(p_1,\dots,p_n). \]

Note that if
\[ \mathrm{GCD}(p_1,\dots,p_n) = 1, \]

then there are $s_1,\dots,s_n \in \mathcal{P}(\mathbb{K})$ such that
\[ s_1p_1 + \cdots + s_np_n = 1. \]

This observation is often used in proofs.
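For two polynomials, the greatest common divisor and the polynomials $s_1, s_2$ with $s_1p_1 + s_2p_2 = \mathrm{GCD}(p_1,p_2)$ can be computed explicitly. The sketch below uses the extended Euclidean algorithm (repeated division with remainder), which is a different route from the minimal-degree argument in the proof of Theorem 5.19. Polynomials are coefficient lists `[a_0, a_1, ...]` over the rationals, and all function names are illustrative assumptions.

```python
from fractions import Fraction


def trim(p):
    """Drop zero coefficients of highest degree (the zero polynomial is [])."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([x + y for x, y in zip(p, q)])


def scale(c, p):
    return trim([c * x for x in p])


def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return trim(out)


def divide(p, s):
    """Return (q, r) with p = q*s + r and r zero or deg r < deg s."""
    p = [Fraction(c) for c in trim(p)]
    s = [Fraction(c) for c in trim(s)]
    q = [Fraction(0)] * max(len(p) - len(s) + 1, 0)
    while len(p) >= len(s):
        shift, factor = len(p) - len(s), p[-1] / s[-1]
        q[shift] = factor
        for i, b in enumerate(s):
            p[shift + i] -= factor * b
        p = trim(p)
    return q, p


def extended_gcd(p, q):
    """Return (d, u, v) with d monic, d = GCD(p, q) and u*p + v*q = d.

    Both p and q are assumed to be nonzero polynomials.
    """
    r0, r1 = trim(p), trim(q)
    u0, v0, u1, v1 = [1], [], [], [1]   # r0 = u0*p + v0*q, r1 = u1*p + v1*q
    while r1:
        quo, rem = divide(r0, r1)
        r0, r1 = r1, rem
        u0, u1 = u1, add(u0, scale(-1, mul(quo, u1)))
        v0, v1 = v1, add(v0, scale(-1, mul(quo, v1)))
    lead = Fraction(r0[-1])             # normalize to a monic polynomial
    return scale(1 / lead, r0), scale(1 / lead, u0), scale(1 / lead, v0)


# GCD of t^2 - 1 and t^2 - 3t + 2 is t - 1.
p1, p2 = [-1, 0, 1], [2, -3, 1]
d, u, v = extended_gcd(p1, p2)
combination = add(mul(u, p1), mul(v, p2))   # equals d
```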


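Definition 5.22. A nonzero polynomial $p \in \mathcal{P}(\mathbb{K})$ with $\deg p \geq 1$ is
called irreducible in $\mathcal{P}(\mathbb{K})$ if $p = fg$ and $f,g \in \mathcal{P}(\mathbb{K})$ implies $f \in \mathbb{K}$ or
$g \in \mathbb{K}$.

Note that irreducibility of a polynomial may depend on $\mathbb{K}$. For example,
the polynomial $t^2 + 1$ is irreducible in $\mathcal{P}(\mathbb{R})$ but it is not irreducible in $\mathcal{P}(\mathbb{C})$.
Clearly, if a polynomial in $\mathcal{P}(\mathbb{R})$ is irreducible in $\mathcal{P}(\mathbb{C})$ then it is irreducible in
$\mathcal{P}(\mathbb{R})$.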




Example 5.23. Show that the polynomial $at + b$ is irreducible for any $a,b \in \mathbb{K}$
with $a \neq 0$.

Solution. If $at + b = f(t)g(t)$, then $\deg f + \deg g = 1$ and thus $\deg f = 0$ or
$\deg g = 0$. Consequently, $f \in \mathbb{K}$ or $g \in \mathbb{K}$. □

Example 5.24. Show that a polynomial of degree at least 2 which has a
root in $\mathbb{K}$ is not irreducible.

Solution. This is an immediate consequence of Theorem 5.12. □

Example 5.25. Let $a,b,c \in \mathbb{R}$ with $a \neq 0$. Show that the polynomial $at^2 + bt + c$
is irreducible over $\mathbb{R}$ if and only if $4ac - b^2 > 0$.

Solution. If $4ac - b^2 > 0$, then

\[ at^2 + bt + c = a\left(t^2 + \frac{b}{a}t + \frac{c}{a}\right) = a\left(\left(t + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a^2}\right). \]

Since

\[ \left(t + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a^2} > 0 \]

for every $t \in \mathbb{R}$, the polynomial $at^2 + bt + c$ has no real root and consequently is irreducible.
If $4ac - b^2 \leq 0$, then the equation $at^2 + bt + c = 0$ is equivalent to
\[ \left(t + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2} \]

and the polynomial $at^2 + bt + c$ has the well known roots

\[ -\frac{b}{2a} \pm \frac{\sqrt{b^2 - 4ac}}{2a} \]

and thus is not irreducible. □
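The criterion of Example 5.25 is easy to test numerically. This is a small Python sketch with floating-point roots; the function names `is_irreducible_over_reals` and `real_roots` are hypothetical and cover real quadratics only.

```python
import math


def is_irreducible_over_reals(a, b, c):
    """at^2 + bt + c with a != 0 is irreducible over R iff 4ac - b^2 > 0."""
    if a == 0:
        raise ValueError("a must be nonzero")
    return 4 * a * c - b * b > 0


def real_roots(a, b, c):
    """Real roots -b/(2a) -/+ sqrt(b^2 - 4ac)/(2a), or [] if there are none."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return [(-b - root) / (2 * a), (-b + root) / (2 * a)]


# t^2 + 1 has no real root; t^2 - 3t + 2 = (t - 1)(t - 2).
irreducible_example = is_irreducible_over_reals(1, 0, 1)
reducible_example = is_irreducible_over_reals(1, -3, 2)
roots = real_roots(1, -3, 2)
```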


Example 5.26. Let $a,b,c,d \in \mathbb{R}$ with $a \neq 0$. If the polynomial $at^3 + bt^2 + ct + d$
has a complex root $x + yi$ such that $y \neq 0$, show that the polynomial $(t - x)^2 + y^2$
divides $at^3 + bt^2 + ct + d$.




Solution. If $at^3 + bt^2 + ct + d$ has a root $x + yi$ such that $y \neq 0$, then
\[ a(x - yi)^3 + b(x - yi)^2 + c(x - yi) + d = \overline{a(x + yi)^3 + b(x + yi)^2 + c(x + yi) + d} = 0. \]

Note that $(t - x - yi)(t - x + yi) = (t - x)^2 + y^2$ is an irreducible polynomial
in $\mathcal{P}(\mathbb{R})$. Since we can write

\[ at^3 + bt^2 + ct + d = \bigl((t - x)^2 + y^2\bigr)q + et + f, \]

where $e$ and $f$ are real numbers, $et + f$ as a polynomial from $\mathcal{P}(\mathbb{C})$ has two
different roots $x + yi$ and $x - yi$. Consequently $e = f = 0$, by Theorem 5.13.
This shows that the polynomial $(t - x)^2 + y^2$ divides $at^3 + bt^2 + ct + d$. □

Theorem 5.27. If an irreducible polynomial $p$ divides the product $fg$ of
two polynomials, then $p$ divides $f$ or $p$ divides $g$.

Proof. Suppose $p$ does not divide $f$. If $d$ is a polynomial such that $p = dp_1$ and
$f = df_1$ for some $p_1, f_1 \in \mathcal{P}(\mathbb{K})$, then $d \in \mathbb{K}$. Consequently, $\mathrm{GCD}(p, f) = 1$ and
there are polynomials $q$ and $h$ such that

\[ qp + hf = 1. \]
Hence
\[ qpg + hfg = g. \]
Because $p$ divides $qpg$ and $hfg$, it must divide $g$. □

Lemma 5.28. If an irreducible polynomial $p$ divides the product $q_1 \cdots q_m$
of irreducible polynomials $q_1,\dots,q_m$, then there are $j \in \{1,\dots,m\}$ and
$c \in \mathbb{K}$ such that $q_j = cp$.

Proof. Using Theorem 5.27 and induction we can show there is a $j \in \{1,\dots,m\}$
such that $p$ divides $q_j$. Because the polynomial $q_j$ is an irreducible polynomial,
there is a number $c \in \mathbb{K}$ such that $q_j = cp$. □

Theorem 5.29. If
\[ p_1 \cdots p_n = q_1 \cdots q_m \]

for some irreducible polynomials $p_1,\dots,p_n,q_1,\dots,q_m$, then $n = m$ and,
after renumbering $q_1,\dots,q_m$ if necessary, there are numbers $c_1,\dots,c_n$ such that $q_1 = c_1p_1,\dots,q_n = c_np_n$.




Proof. Without loss of generality, we can assume that $q_1 = c_1p_1$ where $c_1 \in \mathbb{K}$,
by Lemma 5.28. This gives us

\[ p_2 \cdots p_n = c_1q_2 \cdots q_m. \]

Because the polynomial $c_1q_2$ is irreducible, we can finish the proof by induction.
□

Theorem 5.30. Every nonzero polynomial can be written as a product
of irreducible polynomials.

Proof. Let $p$ be a nonzero polynomial such that $\deg p > 0$. If $p$ is irreducible
we are done. If not, then we can write $p = fg$ where $\deg f > 0$ and $\deg g > 0$.
Now we continue in the same way with the polynomials $f$ and $g$ and finish the
proof by induction. □

Because for every irreducible polynomial $p$ there is a unique monic irreducible
polynomial $q$ such that $p = cq$, where $c \in \mathbb{K}$, every polynomial $p$ such that
$\deg p > 0$ can be uniquely written as

\[ p = c\,q_1^{m_1} \cdots q_n^{m_n} \tag{5.2} \]

where $c \in \mathbb{K}$, $q_1,\dots,q_n$ are distinct irreducible monic polynomials, and $m_1,\dots,m_n$ are positive integers.
As a consequence of Theorems 5.29 and 5.30 we get the following result.

Theorem 5.31. Let $p_1,\dots,p_n$ be nonzero polynomials. There is a
unique monic polynomial $m$ such that all polynomials $p_1,\dots,p_n$ divide
$m$ and if all polynomials $p_1,\dots,p_n$ divide a nonzero polynomial $h$ then
$m$ divides $h$.

Definition 5.32. Let $p_1,\dots,p_n$ be nonzero polynomials. The unique
monic polynomial $m$ from Theorem 5.31 is called the lowest common
multiple of the polynomials $p_1,\dots,p_n$ and is denoted by

\[ \mathrm{LCM}(p_1,\dots,p_n). \]

The following theorem is called the Fundamental Theorem of Algebra. It is
usually proven using complex analysis.

Theorem 5.33. For every $p \in \mathcal{P}(\mathbb{C})$ such that $\deg p > 0$ there is $z \in \mathbb{C}$
such that $p(z) = 0$.




From the Fundamental Theorem of Algebra and Theorem 5.12 we obtain 
the following important result. 


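Theorem 5.34. If $p \in \mathcal{P}(\mathbb{C})$ has exactly $k$ distinct roots $\alpha_1,\dots,\alpha_k$, then

\[ p(t) = c(t - \alpha_1)^{m_1} \cdots (t - \alpha_k)^{m_k} \]

for some $c \in \mathbb{C}$ and integers $m_1,\dots,m_k \geq 1$.

Theorem 5.35. If $p \in \mathcal{P}(\mathbb{R})$ is irreducible and $\deg p \geq 2$, then
\[ p(t) = at^2 + bt + c \]

where $a,b,c \in \mathbb{R}$, $a \neq 0$, and $4ac - b^2 > 0$.

Proof. If $p$ has a real root, then $p$ is not irreducible.

If $\deg p \geq 3$ and $p$ has no real root, then, by the Fundamental Theorem of Algebra, $p$ has a complex root of the form $x + yi$ with $y \neq 0$ and
we can show, as in Example 5.26, that the polynomial $(t - x)^2 + y^2$ divides $p$
and consequently $p$ is not irreducible.

To complete the proof we note that a polynomial $p(t) = at^2 + bt + c$ with
$a,b,c \in \mathbb{R}$, $a \neq 0$, and $4ac - b^2 > 0$, is irreducible. □

Now we can state a version of Theorem 5.34 for $\mathcal{P}(\mathbb{R})$.

Theorem 5.36. If $p \in \mathcal{P}(\mathbb{R})$, then
\[ p = c\,q_1^{m_1} \cdots q_k^{m_k}, \]

where $c \in \mathbb{R}$ and for every $j \in \{1,\dots,k\}$ the polynomial $q_j$ is either of
the form $t - \alpha$ for some $\alpha \in \mathbb{R}$ or of the form $t^2 + \beta t + \gamma$ for some $\beta,\gamma \in \mathbb{R}$
such that $\beta^2 - 4\gamma < 0$.

Lagrange interpolation theorem

Theorem 5.37. For any integer $n \geq 1$ and $\alpha_1,\dots,\alpha_n,\beta_1,\dots,\beta_n \in \mathbb{K}$,
such that $\alpha_1,\dots,\alpha_n$ are distinct, there is a polynomial $p \in \mathcal{P}(\mathbb{K})$ such
that

\[ p(\alpha_1) = \beta_1, \dots, p(\alpha_n) = \beta_n. \]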




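Proof. For every $j \in \{1,\dots,n\}$ the polynomial

\[ q_j(t) = \frac{(t - \alpha_1) \cdots (t - \alpha_{j-1})(t - \alpha_{j+1}) \cdots (t - \alpha_n)}{(\alpha_j - \alpha_1) \cdots (\alpha_j - \alpha_{j-1})(\alpha_j - \alpha_{j+1}) \cdots (\alpha_j - \alpha_n)} \]

satisfies $q_j(\alpha_j) = 1$ and $q_j(\alpha_k) = 0$ for all $k \neq j$. Thus we can take

\[ p = \beta_1q_1 + \cdots + \beta_nq_n. \]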
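The construction in the proof can be evaluated directly. The following Python sketch computes the value $p(t) = \beta_1q_1(t) + \cdots + \beta_nq_n(t)$ at a given point using exact rational arithmetic; it assumes rational (for example integer) nodes and values, and the names `lagrange_basis` and `interpolate` are illustrative.

```python
from fractions import Fraction


def lagrange_basis(alphas, j, t):
    """Value q_j(t) of the j-th basis polynomial (j is a 0-based index)."""
    value = Fraction(1)
    for k, alpha in enumerate(alphas):
        if k != j:
            value *= Fraction(t - alpha) / Fraction(alphas[j] - alpha)
    return value


def interpolate(alphas, betas, t):
    """Value at t of p = beta_1 q_1 + ... + beta_n q_n."""
    if len(set(alphas)) != len(alphas):
        raise ValueError("the nodes must be distinct")
    return sum(beta * lagrange_basis(alphas, j, t)
               for j, beta in enumerate(betas))


# Nodes 0, 1, 2 with values 1, 3, 11 (these are values of 3t^2 - t + 1).
alphas, betas = [0, 1, 2], [1, 3, 11]
values_at_nodes = [interpolate(alphas, betas, a) for a in alphas]
value_at_3 = interpolate(alphas, betas, 3)
```

At the nodes the function returns the prescribed values, and at $t = 3$ it returns $25$, which agrees with $3t^2 - t + 1$ evaluated at $3$.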


The formal derivative of a polynomial 


Definition 5.38. By the derivative of a polynomial $p(t) = a_nt^n + \cdots + a_0$
we mean the polynomial

\[ na_nt^{n-1} + \cdots + 2a_2t + a_1. \]

The derivative of a polynomial $p$ is denoted by $p'$, that is, if $p(t) = a_nt^n + \cdots + a_0$
then $p'(t) = na_nt^{n-1} + \cdots + 2a_2t + a_1$.

Theorem 5.39. For any $p,q \in \mathcal{P}(\mathbb{K})$ and $\alpha \in \mathbb{K}$ we have

(a) $(\alpha p)' = \alpha p'$;

(b) $(p + q)' = p' + q'$;

(c) $(pq)' = p'q + pq'$.

Proof. Clearly, we have $(\alpha p)' = \alpha p'$ and $(p + q)' = p' + q'$. To show that
$(pq)' = p'q + pq'$ we first note that

\[ (t^{m+n})' = (m+n)t^{m+n-1} = mt^{m-1}t^n + nt^mt^{n-1} = (t^m)'t^n + t^m(t^n)', \]

for any positive integers $m$ and $n$. To finish the proof we use the fact that, if
$p_1,p_2,q \in \mathcal{P}(\mathbb{K})$ are polynomials such that $(p_1q)' = p_1'q + p_1q'$ and $(p_2q)' = p_2'q + p_2q'$, then
\begin{align*}
((p_1 + p_2)q)' &= (p_1q)' + (p_2q)' = p_1'q + p_1q' + p_2'q + p_2q' \\
&= (p_1' + p_2')q + (p_1 + p_2)q' = (p_1 + p_2)'q + (p_1 + p_2)q'.
\end{align*}

□
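The formal derivative acts on coefficients only, so the product rule can be checked on examples without any limits. In this Python sketch a polynomial is a coefficient list `[a_0, a_1, ..., a_n]`; the helper names `derivative`, `poly_mul`, and `poly_add` are assumptions for the illustration.

```python
def derivative(p):
    """Formal derivative: [a_0, a_1, ..., a_n] -> [a_1, 2 a_2, ..., n a_n]."""
    return [k * c for k, c in enumerate(p)][1:]


def poly_mul(p, q):
    """Product of two nonzero polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out


def poly_add(p, q):
    """Sum of two polynomials, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [x + y for x, y in zip(p, q)]


p = [1, 2, 3]   # 1 + 2t + 3t^2
q = [5, 0, 1]   # 5 + t^2

left = derivative(poly_mul(p, q))                      # (pq)'
right = poly_add(poly_mul(derivative(p), q),
                 poly_mul(p, derivative(q)))           # p'q + pq'
```

For these inputs both `left` and `right` are the coefficients of $10 + 32t + 6t^2 + 12t^3$.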


Using Theorem 5.39 and mathematical induction we obtain the following 
useful result. 




Corollary 5.40. For any $p \in \mathcal{P}(\mathbb{K})$ and any integer $n \geq 1$ we have
\[ (p^n)' = np^{n-1}p'. \]


Appendix D Infinite dimensional inner product spaces


Many results in Chapter 3 are formulated and proved for finite dimensional inner
product spaces. While some of these results remain true in infinite dimensional
inner product spaces, others do not or require additional assumptions. Here we
briefly address the issues arising in infinite dimensional inner product spaces.

The important Representation Theorem 3.3.1 is no longer true if we remove
the assumption that $\mathcal{V}$ is finite dimensional. Indeed, consider the space $\mathcal{V}$ of
all continuous functions on the interval $[0,1]$ with the inner product
$\langle f,g \rangle = \int_0^1 f(t)\overline{g(t)}\,dt$ and the function $\Phi : \mathcal{V} \to \mathbb{K}$ defined by

\[ \Phi(f) = \int_0^{1/2} f(t)\,dt. \]

The function $\Phi$ is clearly a linear transformation, but there is no continuous
function $g_0$ such that

\[ \Phi(f) = \langle f,g_0 \rangle = \int_0^1 f(t)\overline{g_0(t)}\,dt, \]

for every continuous function $f$, because then we would have to have $g_0(t) = 1$
for all $t \in (0,\frac{1}{2})$ and $g_0(t) = 0$ for all $t \in (\frac{1}{2},1)$.

The Representation Theorem guarantees that every linear transformation
between finite dimensional spaces has an adjoint. This is no longer true in
infinite dimensional inner product spaces. Indeed, consider the space $\mathcal{V} = \mathbb{C}^\infty$
of all infinite sequences of complex numbers with only a finite number of nonzero
terms with the inner product defined as
\[ \langle (x_1,x_2,\dots),(y_1,y_2,\dots) \rangle = \sum_{j=1}^{\infty} x_j\overline{y_j}. \]

Note that because all but a finite number of $x_j$'s and $y_j$'s are 0, the summation
is always finite and thus we don't have to worry about convergence of the series.
Now consider the function $f : \mathbb{C}^\infty \to \mathbb{C}^\infty$ defined as

\[ f((x_1,x_2,\dots)) = \Bigl(\sum_{j=1}^{\infty} x_j, \sum_{j=2}^{\infty} x_j, \dots\Bigr). \]




It is clearly a linear transformation from $\mathbb{C}^\infty$ to $\mathbb{C}^\infty$. Now suppose there is
a linear transformation $g : \mathbb{C}^\infty \to \mathbb{C}^\infty$ such that $\langle f(\mathbf{x}),\mathbf{y} \rangle = \langle \mathbf{x},g(\mathbf{y}) \rangle$ for all
$\mathbf{x},\mathbf{y} \in \mathbb{C}^\infty$. Then for every integer $j \geq 1$ we would have

\[ \langle \mathbf{e}_j,g(\mathbf{e}_1) \rangle = \langle f(\mathbf{e}_j),\mathbf{e}_1 \rangle = 1, \]

where $\mathbf{e}_j$ is the sequence that has 1 in the $j$-th place and zeros everywhere else.
But this is not possible because this means that $g(\mathbf{e}_1) = (1,1,\dots) \notin \mathbb{C}^\infty$.

Since there are linear transformations that do not have adjoints, in every
theorem that says something about the adjoint of a transformation we assume
that the inner product space is finite dimensional. Many of those theorems
remain true in all inner product spaces if we simply assume that the
transformations have adjoints. For example, if we assume that all transformations
in Theorem 3.3.4 have adjoints, then the theorem is true for all inner
product spaces.

The definition of self-adjoint operators is formulated for operators on finite
dimensional spaces, but this restriction is not necessary. We can say that a linear
transformation $f : \mathcal{V} \to \mathcal{V}$ is self-adjoint if $\langle f(\mathbf{x}),\mathbf{y} \rangle = \langle \mathbf{x},f(\mathbf{y}) \rangle$ for all $\mathbf{x},\mathbf{y} \in \mathcal{V}$. Note
that this definition makes sense in any inner product space. Many of the
properties of self-adjoint operators proved in Chapter 3 remain true in infinite
dimensional inner product spaces, and quite often the presented proof does not
require any changes.

There are theorems that depend in an essential way on the assumption that
the space is of finite dimension. For example, in Theorem 3.4.56 we show that
on a finite dimensional inner product space a linear operator is unitary if and
only if it is isometric. This is not true in general. Indeed, consider the space
$\mathcal{V} = \mathbb{C}^\infty$ defined above and the linear operator $f : \mathbb{C}^\infty \to \mathbb{C}^\infty$ defined as

\[ f(x_1,x_2,\dots) = (0,x_1,x_2,\dots). \]
Note that this operator has an adjoint:

\[ f^*(x_1,x_2,\dots) = (x_2,x_3,\dots). \]

Since
\[ \|f(x_1,x_2,\dots)\|^2 = |0|^2 + |x_1|^2 + |x_2|^2 + \cdots = |x_1|^2 + |x_2|^2 + \cdots = \|(x_1,x_2,\dots)\|^2, \]
we have $\|f(\mathbf{x})\| = \|\mathbf{x}\|$ for every $\mathbf{x} \in \mathbb{C}^\infty$ and thus $f$ is an isometric operator.

On the other hand, since $\operatorname{ran} f \neq \mathbb{C}^\infty$ and $ff^* \neq \mathrm{Id}$, $f$ is not a unitary operator.
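The behaviour of the shift operator can be observed on concrete sequences. The Python sketch below assumes that a finitely supported sequence is stored as a finite tuple, with all later terms understood to be zero; the names `shift`, `shift_adjoint`, and `inner` are illustrative.

```python
def shift(x):
    """f(x1, x2, ...) = (0, x1, x2, ...)."""
    return (0,) + tuple(x)


def shift_adjoint(x):
    """f*(x1, x2, ...) = (x2, x3, ...)."""
    return tuple(x[1:])


def inner(x, y):
    """Sum of x_j * conj(y_j); terms beyond the shorter tuple are zero."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


x = (1, 2j, 3)
y = (4, 5, 6j, 7)

adjoint_identity = inner(shift(x), y) == inner(x, shift_adjoint(y))
norm_preserved = inner(shift(x), shift(x)) == inner(x, x)
left_inverse = shift_adjoint(shift(x)) == x      # f* f = Id
right_inverse = shift(shift_adjoint(x)) == x     # f f* = Id fails here
```

For this `x` the composition `shift(shift_adjoint(x))` is `(0, 2j, 3)`, which loses the first term, so `right_inverse` is `False` while the other three checks are `True`.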
Checking for which theorems the assumption of finite dimensionality is essential
would be an excellent way to review Chapter 3.







